CONTROLLING BIAS IN MAIL QUESTIONNAIRES*


JOHN A. CLAUSEN
Cornell University
AND
ROBERT N. FORD
American Telephone and Telegraph Company


In all instances where the mail questionnaire is used, one 
must be prepared to deal with the problem of bias due to non- 
response. Attention is given to two aspects of the problem: 
(1) maximizing response by every possible means in order to 
cut down the size of the non-respondent group whose char- 
acteristics and attitudes are unknown; and (2) making 
allowances or corrections for any bias that may exist in the 
incomplete returns.

In mail follow-ups of veterans who had not responded to 
the initial questionnaire, personalized salutation and true 
signature did not lead to significant increases over non- 
personalized forms in rate of response, but the use of special 
delivery letters markedly increased returns. A multiphasic 
survey, covering several potentially interesting topics, yielded 
higher rates of response than a single subject survey of the 
same population, and also greatly lessened an interest bias in 
response. 

Successive waves of response may give an informal basis for 
estimating bias among those who do not respond even after 
several follow-ups. The view that mail surveys of a homo- 
geneous population are not seriously affected by bias is re- 
futed by data drawn from such surveys. 


* A paper presented at the 106th Annual Meeting of the American Statistical Association, Atlantic City, N. J., Jan. 24, 1947. The data are drawn from mail surveys carried out by the staff of the Research Service, Coordination and Planning, Veterans Administration, of which both authors were members.

Most of those who comment on the use of mail questionnaires still
maintain the position accepted in the 1930’s that mail questionnaires
are of little or no value. Perhaps the Literary Digest fiasco, which
resulted from non-corrected bias in a mail survey, closed the issue for
many. For example, McNemar recently damned the mail questionnaire
unequivocally as a source of attitude data: “Mailed schedules or the
telephone may satisfy the less scrupulous but such methods should be
taboo.”¹ It is our belief that one need be neither unscrupulous nor
naive to find the mail survey a satisfactory technique for securing atti- 
tude data in certain situations. Its uses may perhaps be somewhat 
more subject to limitation or in need of supplementation than the use 
of certain other techniques, but no one method of obtaining data is 
most satisfactory for all types of studies. Consideration of the require- 
ments of the research problem should determine the technique to be 
used, and for certain problems, the mail technique has decided advan- 
tages which outweigh its disadvantages. 

Since there have been a number of excellent statements of both advantages
and disadvantages, there is no need to go into details here.²
The alleged advantages involve considerations of economy, possibility 
for wide geographical distribution, elimination of interviewer bias, pos- 
sible gain in validity by assurance of anonymity, suitability for reach- 
ing important people and other classes difficult to reach, greater care 
by the individual in making his responses, possibility for family or other 
group consultation prior to reply, and elimination of call-backs. 

The disadvantages are principally the incurring of a number of risks 
as a result of lack of close control over the circumstances of response. 
There are risks of bias due to non-response, of response from other than 
the person to whom addressed, of group response where an individual 
response is desired, or of consulting sources of data where level of in- 
formation is to be tested. There are limitations imposed by the relative 
inflexibility of a questionnaire or again by the lack of control in administration
of the questionnaire: limitations with respect to probing in
areas of special interest to the respondent, or inability to control the 
sequence in which the respondent reads and answers questions, etc. 
Field work for a mail survey may consume an excessive amount of 
time if several follow-up letters are used. This is not only an important 
consideration because it delays the final report, but also because the 
facts being investigated may change during the period of the field work. 
Early and late returns may be incomparable because of this. 

It will be apparent that certain aspects of the use of the mail ques- 
tionnaire will be judged as advantages for one study and as disad- 
vantages for another. In some instances, as where probing is required 
or close control is desired over the conditions under which a questionnaire
is administered, the mail questionnaire is clearly ruled out. In all
instances where the mail questionnaire is used, however, one has to be
prepared to deal with the problem of bias due to non-response. It is the
main topic of this paper.

1 McNemar, Quinn, “Attitude-Opinion Methodology,” Psychological Bulletin, Vol. 43, 1946, p. 328.
2 See for example Lundberg, G. A., Social Research, New York, 1942, Chapter VII; Benson, L. E., “Mail Surveys Can Be Valuable,” Public Opinion Quarterly, Vol. 10, 1946, pp. 234-35; Blankenship, A. B., Consumer and Opinion Research, New York, 1943, pp. 43-47.

There are two aspects to the problem: (1) maximizing response by 
every means possible in order to cut down the size of the non-respond- 
ent group whose characteristics and attitudes are unknown; (2) making 
allowance or corrections for any bias that may exist in the incomplete 
returns. The basic objective is, of course, to secure returns in the proper 
proportion from those strata of the population which must be repre- 
sented in the sample desired, or, as a substitute, making reasonably 
accurate estimates of the distribution of non-respondents by char- 
acteristics and by attitude classification. 


MAXIMIZING RESPONSES 


The maximum possible response rate is always desired in order to 
minimize the importance of response bias. What may be expected? The 
answer depends upon the characteristics of the population to be sur- 
veyed, upon the intrinsic interest to that population of the subject 
matter of the survey, upon the prestige of the survey auspices, and 
upon the success of the questionnaire and the covering letter in enlist- 
ing interest to the fullest possible extent. 

Surveying segments of the veteran population, we have regularly 
achieved 80-90 per cent returns when two follow-up letters were used. 
Experience with the veteran population, under the auspices of a gov- 
ernment agency charged with aiding veterans, cannot, of course, be 
glibly generalized to other populations. On the other hand, it seems 
likely that at least some of the factors making for return of the ques- 
tionnaire are the same, regardless of survey subject and population 
differences. Moreover, numerous users of the technique report returns
in excess of 50 per cent (with several in excess of 90 per cent) using
one or more follow-ups.³

3 See for example Reid, S., “Respondents and Non-respondents to Mail Questionnaires,” Educational Research Bulletin, Vol. 21, 1942, pp. 87-96 (95 per cent response from Ohio school principals); Sletto, R., “Pretesting of Questionnaires,” American Sociological Review, Vol. 5, 1940, pp. 193-200 (69 per cent response from college alumni); Stanton, F., “Notes on the Validity of Mail Questionnaire Returns,” Journal of Applied Psychology, Vol. 23, 1939, pp. 95-104 (50 per cent and 94 per cent returns to two mail surveys); Suchman, E. A. and McCandless, B., “Who Answers Questionnaires?” Journal of Applied Psychology, Vol. 24, 1940, pp. 758-69 (48 per cent and 95 per cent returns to two mail surveys).

What are the ways by which the rate of returns may be raised? There
is surprisingly little in the way of systematic evidence on the subject in
the literature. What there is comes largely from marketing and advertising
journals, with Sletto’s systematic investigation a notable
exception. Sletto’s research suggests that (for a population of college
alumni) the factor of length is relatively unimportant, at least as between
questionnaires 10 and 25 pages in length; an altruistic appeal to
give information which would help others may be slightly more effec- 
tive than a challenging appeal, and a postcard follow-up was as effec- 
tive as a letter containing the same statement. A number of writers 
state that it is desirable to personalize the covering letter as much as 
possible. Seitz⁴ reports that use of regular postage stamps on envelopes
produces a higher rate of returns than does use of prepaid or metered 
mail. He also strongly advises the use of a separate covering letter with 
the questionnaire, rather than printing the letter on the questionnaire. 

We have systematically used follow-ups in all mail surveys and have 
experimented to some extent on the effects of using extra postage (air 
mail and special delivery), and of personalizing the covering letter in 
securing a higher rate of return. 

Form and Content of the Covering Letter. Few hard and fast rules can 
be made about the covering letter. Our general preference and practice 
has been to use a separate uncrowded sheet for the letter, but one 
survey for which the letter was printed on the front of a 3 page ques- 
tionnaire drew 88 per cent returns after two follow-ups. 

In general we have tried to indicate, in simple language, the purposes 
of the survey, the importance of the views of the addressee as a representative
of many persons, and the uses to which the data will be put.

We have preferred a reasoned appeal to a short “punchy” type of
letter, even to cross sections of the veteran population where less than 
half of the group had finished high school. A short “punchy” letter drew 
only 23 per cent response in a survey of attitudes toward National 
Service Life Insurance, while a longer, more reasoned appeal drew 36 
per cent response from veterans who had failed to answer the initial 
appeal. 

Occasionally the letters have run as long as 300 words in order to 
make all the points that were considered relevant to motivating the 
respondent. 

4 Seitz, Richard M., “How Mail Surveys May be Made to Pay,” Printers’ Ink, 209, 9, Dec. 1, 1944, pp. 17-19.

Personalization and Extra Postage. An experiment to test the relative
drawing power of letters differing in degree of personalization and the
effects of using extra postage was conducted as part of the follow-up
to the survey on attitudes toward, and information about, National
Service Life Insurance (NSLI). The initial mailing, which used a short
letter addressed to “Dear Veteran,” and bearing the facsimile signature
of General Bradley, drew only 23 per cent returns from a cross
section of veterans who had held insurance while in service. Partly to
get some indication of bias in these initial returns, and partly to test
the relative efficiency of personalized salutation (name, address and
Dear Mr. ____) and true signature as opposed to the impersonal salutation
(no address and Dear Veteran) and facsimile signature, a subsample
of non-respondents to the initial mailing was divided systematically
into five mailing groups. Each group was sent the same basic
follow-up letter along with another copy of the questionnaire, but with
the following differences in salutation, signature and type of postage:








Group   Letter Form                                   Type of Postage                  Number of Cases
1       Impersonal salutation, personal signature     Franked                          300
2       Impersonal salutation, facsimile signature    Franked                          300
3       Personal salutation, facsimile signature      Franked                          300
4       Personal salutation, personal signature       Franked                          400
5       Personal salutation, personal signature       Air mail and special delivery    400





The signature used in the follow-up, whether true or facsimile, was 
that of the Director of Research Service rather than General Bradley. 
Five weeks after the mailing of the follow-up letters, completed ques- 
tionnaires had been received from 41 per cent of the group which failed 
to respond to the initial mailing. The response rate for groups 1 to 4 
was 36 per cent. The differences in rate among these four groups are 
well within the range of fluctuations which would be expected to occur 
by chance alone. Thus, there is no evidence that the personalized 
salutation or personal signature caused a significant increase over the 
non-personalized forms in rate of response. 

Among those veterans who received air mail-special delivery letters, 
however, the rate of response was 61 per cent, a marked and clearly 
significant increase over receipts from veterans who received franked 
envelopes. (A small scale test showed no reason to expect a higher rate 
of return from government mail when a 3¢ stamp rather than franked 
postage was used.) Moreover, nearly three-fifths of the questionnaires 
returned by Group 5 in answer to the follow-up were received within 
one week after mailing, while only two-fifths of the returns from groups 
1 to 4 were received within this period. The data on rate of response 
are summarized in Table 1. 
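
The two comparisons just described can be checked with a short calculation. The sketch below is not the authors' computation. It assumes that approximate counts can be rebuilt from the group sizes and the rounded response rates reported for the follow-up (34, 36, 40 and 36 per cent for groups 1 to 4, and 61 per cent for group 5). It then applies a Pearson chi-square test of homogeneity to groups 1 to 4 and a pooled two-proportion z statistic to group 5 against groups 1 to 4 combined. The function names are illustrative.

```python
import math

# Group sizes from the mailing plan and response rates (per cent) from Table 1.
# Counts rebuilt from rounded percentages are approximations, not original tallies.
SIZES = {1: 300, 2: 300, 3: 300, 4: 400, 5: 400}
RATES = {1: 34, 2: 36, 3: 40, 4: 36, 5: 61}


def responded(group):
    """Approximate number of returns in a group."""
    return SIZES[group] * RATES[group] / 100


def chi_square_homogeneity(groups):
    """Pearson chi-square statistic for equal response rates across groups."""
    total_n = sum(SIZES[g] for g in groups)
    pooled = sum(responded(g) for g in groups) / total_n
    stat = 0.0
    for g in groups:
        obs_yes = responded(g)
        obs_no = SIZES[g] - obs_yes
        exp_yes = SIZES[g] * pooled
        exp_no = SIZES[g] * (1 - pooled)
        stat += (obs_yes - exp_yes) ** 2 / exp_yes
        stat += (obs_no - exp_no) ** 2 / exp_no
    return stat


def two_proportion_z(r1, n1, r2, n2):
    """z statistic for the difference of two proportions, pooled variance."""
    pooled = (r1 + r2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (r1 / n1 - r2 / n2) / se


franked = [1, 2, 3, 4]
chi2 = chi_square_homogeneity(franked)
z = two_proportion_z(
    responded(5),
    SIZES[5],
    sum(responded(g) for g in franked),
    sum(SIZES[g] for g in franked),
)
print(f"chi-square, groups 1-4 (3 d.f.): {chi2:.2f}")
print(f"z, group 5 vs groups 1-4:        {z:.2f}")
```

With three degrees of freedom the 5 per cent critical value of chi-square is about 7.81, so a statistic near 2.5 is consistent with chance variation among the four franked groups, while a z of roughly 8.7 lies far beyond conventional significance thresholds.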

Because personalization of the follow-up letter was not instrumental 
in raising returns, one cannot conclude that personalization of the ini- 
tial letter would likewise have made no difference in response rate. The 
mere fact of follow-up is in itself a personalizing device. It indicates
that someone is really interested in the subject’s response, interested 
enough to take note of the absence of that response. Therefore, even a 
“Dear Veteran” follow-up letter may be interpreted as a direct personal 
appeal. The use of extra postage apparently “registers” the urgency 
with which a response is desired much better than does any personaliz- 
ing within the letter. 

On the other hand, the use of extra postage on the initial mailing 


TABLE 1

RESPONSE RATES TO SINGLE FOLLOW-UP BY EXTENT OF PERSONALIZATION
AND TYPE OF MAILING (INSURANCE SURVEY)

Group   Type of Postage                  Total Returns   Returns in    Additional Returns   Additional Returns Received
                                         (per cent)      First Week    in Second Week       after Two Weeks
1       Franked                          34
2       Franked                          36
3       Franked                          40
4       Franked                          36
1-4     Franked                                          16            [illegible]          [illegible]
5       Air mail and special delivery    61              35            20                   6





might well give less gain over regular postage than was obtained from 
its use on the follow-up. A study conducted by the Troop Attitude Re- 
search Branch of the War Department supports this belief. The sample 
to be surveyed was divided systematically, half to receive regular 
postage and half special postage on the initial mailing. Returns were 
received from 71 per cent of the regular postage group and from 74 per 
cent of the special postage group. The study was obviously an inter- 
esting one to the population being surveyed, and it is possible that a 
wider difference might be obtained in studies where interest runs con- 
siderably lower. Even if this were the case, however, the added atten- 
tion-getting power of special postage would probably be lost for sub- 
sequent mailings. 

The Follow-up. The importance of the follow-up in maximizing returns
will be apparent from the foregoing. Whether the questionnaire is
ostensibly anonymous⁵ or whether identification is clearly called for,
the follow-up letters in these surveys of veterans have generally stated
that no reply to the initial mailing has been received from the addressee,
have emphasized the importance of his reply and have briefly
repeated the gist of the previous letter. Another copy of the questionnaire
and another return envelope (franked) have routinely been enclosed
with the follow-up letter. In most of our mail surveys, two follow-up
mailings are routinely planned. The extent to which returns have
been increased in five surveys of different segments of the veteran population
is given in Table 2. When the follow-up appeal is not markedly
different from the initial appeal, it appears that the proportion of
previous non-respondents induced to reply to a follow-up is roughly the
same as the response rate to the initial mailing.

5 A serial number stamped or typed under the survey designation has been used for purposes of identification. Less than one-tenth of one per cent of the respondents have removed or defaced the number.

TABLE 2

CUMULATIVE RECEIPTS FROM INITIAL MAILING AND SUCCESSIVE
FOLLOW-UPS, FIVE SURVEY GROUPS (PER CENT)

                    Male            Male            Female        Applicants for   Amputees
                    Separatees      Separatees      Separatees    Educational
                    (July, 1945)    (Dec. 1945)                   Benefits
Initial Mailing     55              45              51            54               66
First Follow-up     76              73              74            77               88
Second Follow-up    88*             87*             87*           88*              94*

Number of cases     (1,812)         (3,209)         (1,771)       (14,606)         (1,594)

* Extra postage used.
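
The rule of thumb stated before Table 2, that each follow-up draws from the remaining non-respondents at roughly the initial response rate, can be checked against the cumulative figures in the table. The sketch below converts each cumulative series into the per cent of outstanding cases that replied to each mailing, and sets beside it the cumulative returns that a constant rate would imply. The constant-rate projection is a simplifying assumption made here for illustration, and the second follow-ups in Table 2 used extra postage, so they are not strictly comparable with the earlier mailings.

```python
# Cumulative per cent returned after the initial mailing, the first follow-up
# and the second follow-up, as printed in Table 2.
TABLE_2 = {
    "Male separatees (July 1945)": [55, 76, 88],
    "Male separatees (Dec. 1945)": [45, 73, 87],
    "Female separatees": [51, 74, 87],
    "Applicants for educational benefits": [54, 77, 88],
    "Amputees": [66, 88, 94],
}


def conditional_rates(cumulative):
    """Per cent of those still outstanding who replied to each mailing."""
    rates = []
    previous = 0.0
    for current in cumulative:
        outstanding = 100.0 - previous
        rates.append(100.0 * (current - previous) / outstanding)
        previous = current
    return rates


def projected_cumulative(initial_rate, mailings):
    """Cumulative per cent expected if every mailing draws the initial rate
    from whoever has not yet replied (an assumption, not a finding)."""
    remaining = 100.0
    projected = []
    for _ in range(mailings):
        remaining *= 1 - initial_rate / 100.0
        projected.append(100.0 - remaining)
    return projected


for survey, cumulative in TABLE_2.items():
    observed = [round(r) for r in conditional_rates(cumulative)]
    projected = [round(c) for c in projected_cumulative(cumulative[0], 3)]
    print(f"{survey}: rate per mailing {observed}, projected cumulative {projected}")
```

For the July separatees, for instance, the rates per mailing come to about 55, 47 and 50 per cent, which is the pattern the text describes.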


In all of the above instances the follow-up letters were mailed after 
intervals of from ten days to two weeks. Thus about six weeks elapsed 
between the initial mailing and the cut-off date for final returns. If 
there is a greater time pressure than this, there are at least two alterna- 
tives. Mailing intervals may be cut down by perhaps a half, or inter- 
viewers may be employed to make personal contacts among non- 
respondents. Experience in using a markedly shorter interval with ten 
per cent of the December separatees (who received the initial mailing a 
month later than the rest of the sample) suggests that the only draw- 
back is a clerical loss; questionnaires will go needlessly to some who are 
in the process of responding. 

Hansen and Hurwitz⁶ have recently described a technique combining
use of a mail questionnaire with interviews of a sample of non-respondents
in order to attain specified precision at minimum costs. Their description
assumed only a single mailing, but the same technique may be
employed when a mail follow-up is used to maximize returns prior to
interviewing. Costs of attaining the required precision may be greatly
reduced even in instances where the follow-up brings far less gain in
response than the instances cited in Table 2. Moreover, unless one has
at his disposal a large, well distributed field staff, the time required
for interviewing a larger number of non-respondents after a single
mailing may be considerably greater than the time required for a mail
follow-up and subsequent interviewing of fewer non-respondents.

6 Hansen, M. H. and Hurwitz, W. N., “The Problem of Non-response in Sample Surveys,” Journal of the American Statistical Association, Vol. 41, 1946, pp. 517-29.
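
One common way to write the combined estimate in a design of this kind weights the mail returns by their share of the sample and lets the interviewed subsample stand in for all non-respondents. The sketch below uses that form. It is an illustration under stated assumptions, not a reproduction of the Hansen and Hurwitz paper, which also treats the choice of subsample size and cost. The figures are hypothetical and the function name is illustrative.

```python
def two_phase_estimate(n_mailed, n_returned, mean_returned, mean_interviewed):
    """Combine mail returns with an interviewed subsample of non-respondents.

    Assumes the interviewed cases are a random sample of the non-respondents,
    so that their mean can represent the whole non-respondent group.
    """
    if not 0 <= n_returned <= n_mailed:
        raise ValueError("n_returned must lie between 0 and n_mailed")
    n_missing = n_mailed - n_returned
    return (n_returned * mean_returned + n_missing * mean_interviewed) / n_mailed


# Hypothetical figures, for illustration only.
estimate = two_phase_estimate(
    n_mailed=1000,
    n_returned=700,
    mean_returned=0.40,     # proportion with the attribute among mail returns
    mean_interviewed=0.20,  # proportion among interviewed non-respondents
)
print(f"combined estimate: {estimate:.2f}")
```

The larger the share of returns secured by mail, the smaller the weight carried by the interviewed subsample, which is why a follow-up before interviewing can lower the cost of reaching a given precision.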

In instances when a final return rate of 80 to 90 per cent is secured 
after two or more follow-ups, the problem of bias due to non-response 
may be so slight as to obviate the need for interviewing non-respond- 
ents, especially if returns to successive waves of follow-up letters are 
analyzed for clues to the nature and extent of response bias. 

Maximizing Interest: Multiphasic Questionnaires. Often the subject
of a survey will be of interest to only a small segment of the survey 
population, yet the attitudes of the total population toward that sub- 
ject are desired. It may be possible to use a multiphasic questionnaire 
which contains items of considerable general interest as a means of 
increasing returns. For example, a single sheet containing the basic
questions of interest in the insurance survey was included as a supple- 
ment to a four-page questionnaire dealing with post-separation em- 
ployment experience and housing needs of veterans. The initial mailing 
of the combined or multiphasic survey brought 42 per cent response, as 
against 23 per cent response for the original insurance survey alone 
which used a slightly shorter questionnaire. 

Even more important, the multiphasic form markedly cut down 
response bias, apart from its influence in raising returns. The regular- 
mail follow-up to the single subject insurance survey brought the total 
response to this questionnaire up to 50 per cent. These returns, how- 
ever, were disproportionately weighted with men who were maintain- 
ing their National Service Life Insurance—over 40 per cent of the 
respondents were in this category as against an estimated 25-30 per 
cent of the original sample (estimated from VA operating statistics). 
With only 42 per cent response to the initial mailing of the multiphasic 
questionnaire which contained the crucial questions on insurance, how- 
ever, the proportion of respondents who were maintaining their insur- 
ance was found to be 29 per cent. Thus, responses to the multiphasic 
form, being induced by other factors than interest in insurance, ex- 
hibited far less bias with respect to their insurance plans and also, of 
course, less bias in characteristics associated with an interest in maintaining
insurance (age, marital status, etc.).

Another bit of evidence on the effect of questionnaire length is afforded
by our experience with the multiphasic approach. Most of the
mail questionnaires used in the surveys of veterans have been from 3
to 6 pages in length. In several instances one or two pages of supplementary
questions have been sent out with the initial mailing of one-fourth
or one-half of the basic questionnaires. In no instance was the
rate of return appreciably diminished by inclusion of the supplemen- 
tary questions. This would seem to support Sletto’s findings that length 
is not of itself a crucial influence on response rates. 


ESTIMATING RESPONSE BIAS 


The problem of bias is, of course, the crucial problem in the use of 
mail questionnaires. Regardless of the techniques used for collecting 
the data, biases in survey results may arise from many sources, as 
Deming has so clearly pointed out.⁷ There may be bias of the auspices,
a conscious or unconscious slanting of responses because of attitudes 
toward the agency or organization sponsoring the survey; there may be 
bias due to imperfect design of the questionnaire, bias arising from 
non-response or omissions, bias arising from late reports, from unrep- 
resentative selection of the date of the survey or unrepresentative 
selection of the sample. 

In the mail survey, bias arising from non-response is usually a major 
problem. What types of response bias are most often encountered? 
Suchman and McCandless found that returns are received in highest 
proportion from individuals interested in the subject of the survey 
and, in addition, from the better educated.⁸ Pace suggests that whether
or not a person will return a questionnaire will depend on “interest;
conscientiousness, habits of promptness; time available, pleasurable
association with the source of the questionnaire; sufficient lack of embarrassment
with one’s present status to be willing to report that
status.”⁹ An instance in which the last named source of bias seems to
have been particularly important is reported by Shuttleworth, who
found in a study of the employment status of alumni that non-respondents
were much more likely to be unemployed.¹⁰ Franzen and Lazarsfeld
state that “mail questionnaires are answered more often by people
who, due to their educational and occupational background, more
easily express themselves in writing, and by people who are more interested
in the topic under discussion.”¹¹

7 Deming, W. E., “On Errors in Surveys,” American Sociological Review, Vol. 9, 1944, pp. 350-69.
8 Suchman, E. A. and McCandless, B., op. cit.
9 Pace, C. R., “Factors Influencing Questionnaire Returns from Former University Students,” Journal of Applied Psychology, Vol. 23, 1939, p. 388.
10 Shuttleworth, F. K., “Sampling Errors Involved in Incomplete Returns to Mail Questionnaires,” Journal of Applied Psychology, Vol. 25, 1941, pp. 588-91.
11 Franzen, R., and Lazarsfeld, P. F., “Mail Questionnaire as a Research Problem,” Journal of Psychology, Vol. 20, 1945, p. 294.

In the mail surveys which we have conducted, we have generally,
although not always, found evidence of a higher response rate from the
better educated veteran than from the less well educated. Further, if
changing characteristics noted in returns to successive follow-ups may 
be taken as evidence as to the characteristics of the residual group of 
non-respondents, we have found a higher response rate from those 
interested in the subject of the survey than from other veterans. The 
use of the follow-up to provide clues as to the characteristics of the 
non-respondent group has, in fact, been basic to our use of mail surveys. 


TABLE 3

FINAL RESPONSE RATES IN THREE MAIL SURVEYS, BY EDUCATIONAL
LEVELS WITHIN SAMPLE

Educational Level                    July 1945          December 1945      Educational
                                     Army Separatees    Army Separatees    Plans Survey
All                                  88                 88                 88
Grade school only                    [illegible]        81                 78
Some high school                     87                 81                 82
High school graduate                 89                 90                 91
Some college and college graduate    90                 92                 94
Number of cases                      (1,812)            (2,475)            (14,606)





Educational Bias. Out of three surveys for which data on educational 
level were available for all members of the initial sample, two showed 
significant response bias with respect to education (Table 3). The ques- 
tionnaires sent to July and December Army separatees were very sim- 
ilar, and the men were in each instance a cross section of those being 
discharged at Army separation centers. The survey of July separatees, 
however, was conducted by the War Department while that of Decem- 
ber separatees was conducted by the Veterans Administration. One 
hypothesis for the lack of educational bias in the earlier survey is that 
the less well educated men were more likely to feel they had to respond 
to an inquiry from the War Department than to one from the Veterans 
Administration. 

Interest Differentials as Shown by Successive Waves of Replies. If re- 
turns from successive mailings are tabulated separately, the trends in 
returns will often aid in estimating the characteristics of those still 
missing. A good example is afforded by a survey to determine the 
number of veterans likely to enroll in schools or training establishments 
in the Fall of 1946 or some subsequent date. 

The over-all rate of returns was 88 per cent. As would be expected, 
there was a much higher initial response from veterans planning to 
attend school or take training than from those not interested in utilizing
the benefits for which they had applied earlier. The degree of bias is 
indicated by receipts to successive mailings (Table 4). 

The estimate for non-respondents is clearly in the nature of a guess, 
but it is an informed guess. It would certainly be unreasonable to as- 
sume that non-respondents were more likely to be attending or plan- 
ning to attend school than respondents to the second follow-up. On the 
other hand, the fact that a considerable number of veterans who were 


TABLE 4

EDUCATIONAL PLANS REPORTED IN ANSWER TO INITIAL MAILING AND TWO
FOLLOW-UPS, AND ESTIMATE FOR TOTAL POPULATION USING
ESTIMATE FOR NON-RESPONDENTS (PER CENT)

Status                           Weighted   Initial    First        Second       Non-Respondents
                                 Total      Mailing    Follow-up    Follow-up    (Estimate)
In school or training            37         42         35           29           25
Definitely planning to enroll    44         46         44           41           40
Considering enrolling            14         10         15           21           25
Not planning to enroll           5          2          6            9            10
Total                            100        100        100          100          100
Weighting (per cent of total
  sample represented)            100        54         23           11           12

Number of cases: 14,606





definitely planning to attend school failed to answer the initial ques- 
tionnaire and the first follow-up, but were led to respond to the second 
follow-up suggests that there are probably still a number planning to 
attend among non-respondents. This might be true especially among 
the fraction (about 2 per cent of all) whose questionnaires were re- 
turned by the post office as undeliverable. 
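
The weighted total in Table 4 is a weighted average of the four columns, each column counting in proportion to its share of the sample. The sketch below repeats that arithmetic from the printed figures. Two points are assumptions: the second follow-up entry for "Not planning to enroll" is taken as 9, the value that makes that column sum to 100, and the non-respondent column is the authors' informed guess rather than an observation.

```python
# Per cent of the total sample represented by each column of Table 4:
# initial mailing, first follow-up, second follow-up, non-respondents.
WEIGHTS = [54, 23, 11, 12]

# Per cent in each category within each column. The last figure in every row
# is the estimate for non-respondents. The 9 in the final row is inferred so
# that the second follow-up column sums to 100.
PLANS = {
    "In school or training": [42, 35, 29, 25],
    "Definitely planning to enroll": [46, 44, 41, 40],
    "Considering enrolling": [10, 15, 21, 25],
    "Not planning to enroll": [2, 6, 9, 10],
}


def weighted_total(values, weights):
    """Weighted average of column percentages."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


for status, values in PLANS.items():
    print(f"{status}: {round(weighted_total(values, WEIGHTS))}")
```

The rounded results agree with the Weighted Total column (37, 44, 14 and 5). Because non-respondents carry only 12 per cent of the weight here, even a considerably different guess for them would shift the totals by a point or two at most.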

In the example reported in Table 4, the high rate of returns after 
two follow-up letters somewhat limits the importance of the estimate 
for non-respondents. In the insurance survey previously referred to, 
however, the use of the single follow-up employing special postage to 
give an estimate for non-respondents yielded a much more important 
gain, as will be noted in Table 5. 
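
How much the allowance for non-respondents matters in the insurance survey can be seen by varying the guess. The sketch below uses the figures printed in Table 5 (weights of 23, 47 and 30 per cent; 49 and 22 per cent with insurance in force among respondents to the initial mailing and to the special delivery follow-up) and recomputes the overall figure for several assumed non-respondent rates. With these rounded inputs the estimate at 15 per cent comes to about 26 rather than the 25 printed in the table. The gap may reflect rounding in the published percentages, though that is only a guess.

```python
# Figures from Table 5, in per cent.
WEIGHTS = {"initial": 23, "follow_up": 47, "non_respondents": 30}
IN_FORCE = {"initial": 49, "follow_up": 22}


def overall_in_force(assumed_non_respondent_rate):
    """Weighted per cent with insurance in force, given a guess for non-respondents."""
    total = (
        WEIGHTS["initial"] * IN_FORCE["initial"]
        + WEIGHTS["follow_up"] * IN_FORCE["follow_up"]
        + WEIGHTS["non_respondents"] * assumed_non_respondent_rate
    )
    return total / 100


# Figure obtained if non-respondents are simply ignored.
respondents_only = (
    WEIGHTS["initial"] * IN_FORCE["initial"]
    + WEIGHTS["follow_up"] * IN_FORCE["follow_up"]
) / (WEIGHTS["initial"] + WEIGHTS["follow_up"])

print(f"initial returns alone:   {IN_FORCE['initial']}")
print(f"all respondents:         {respondents_only:.1f}")
for guess in (10, 15, 22):
    print(f"non-respondents at {guess}:  overall {overall_in_force(guess):.1f}")
```

Any guess between 10 and 22 per cent for non-respondents keeps the overall figure within a few points of the estimate drawn from VA operating statistics, whereas the initial returns alone would have put it near 49 per cent.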

Estimating Bias in a Changing Characteristic—Employment Status. 
The analysis of response bias by employment status of veterans re- 
cently discharged from the Armed Forces was complicated by the rapid 
change of status as veterans were reabsorbed into the civilian economy. 
Here was an instance where early returns showed relatively high unem- 
ployment while returns to the follow-ups received two weeks to a 
TABLE 5

MAINTENANCE OF N.S.L.I. AMONG RESPONDENTS TO INITIAL MAILING AND
SPECIAL DELIVERY FOLLOW-UP AND ESTIMATE FOR TOTAL SURVEY
SAMPLE AFTER MAKING ALLOWANCE FOR NON-RESPONDENTS
(PER CENT)

Status of Insurance              Weighted   Initial    Special Delivery   Non-Respondents
                                 Total      Mailing    Follow-up          (Estimate)
Insurance in force               25         49         22                 15
Insurance lapsed                 75         51         78                 85
Total                            100        100        100                100
Weighting (per cent of total
  sample represented)            100        23         47                 30





month later, showed much less unemployment, a reversal of the usual
trend from early to late returns found in a stable situation. Could the
differential between early and late returns be explained wholly on the
basis of changing status? Was there a tendency for unemployed veterans
to respond in higher proportion, thinking they might thereby more
readily receive assistance? Or, as a further possibility, was the usual bias of
higher returns from the employed actually operating but merely
concealed by sharp changes in status?

Since it was recognized in advance that rapidly changing
employment status would create a problem for the combining of
early and late returns, the questionnaire asked not only for a report of
current status but also for a report of the date on which the veteran
first took a job or entered school after his discharge. From the data thus
secured, it was possible to plot separately the curves of entry into school
or employment for respondents to each wave of questionnaires. For
each of the surveys of separatees, the curves for respondents to the
follow-ups very nearly coincided with that for respondents to the initial
mailing.¹²
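
The comparison of entry curves can be sketched as follows. Each respondent is placed on a common time scale, weeks from discharge to first job or school entry, and the cumulative per cent entered is computed for each wave over a period that all waves could have observed. The data below are hypothetical, as is the six-week comparison period, and the function names are illustrative.

```python
def entry_curve(weeks_to_entry, horizon):
    """Cumulative per cent who had taken a job or entered school by each week
    after discharge. None marks a respondent reporting no entry as yet."""
    n = len(weeks_to_entry)
    curve = []
    for week in range(1, horizon + 1):
        entered = sum(1 for w in weeks_to_entry if w is not None and w <= week)
        curve.append(100.0 * entered / n)
    return curve


def max_gap(curve_a, curve_b):
    """Largest difference, in percentage points, between two curves."""
    return max(abs(a - b) for a, b in zip(curve_a, curve_b))


# Hypothetical weeks from discharge to first job or school entry, by wave.
initial_wave = [1, 2, 2, 3, 4, 4, 5, 6, None, None]
follow_up_wave = [1, 2, 3, 3, 4, 5, 5, 6, 8, None]

# Compare only over weeks that every wave could have observed. Six weeks is
# an assumed figure standing for the time from discharge to the earliest reply.
horizon = 6
initial_curve = entry_curve(initial_wave, horizon)
follow_up_curve = entry_curve(follow_up_wave, horizon)
print(initial_curve)
print(follow_up_curve)
print(f"largest gap: {max_gap(initial_curve, follow_up_curve):.0f} points")
```

Restricting the comparison to a common period matters because later respondents have had more time to find work. Comparing current status alone would confound that extra time with any real difference between the waves.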

12 The technique of anchoring responses with respect to a changing characteristic to the date of change of status, and then using the follow-up waves to estimate the existence of bias, would seem to be adaptable to a number of situations where objective data are involved, but is not to be recommended for use in attempting to estimate bias with respect to attitudes that are known to be in a state of transition.

Further corroboration of the lack of any significant bias in the reporting
of employment status was provided by following up through
personal interviews nearly half of the July separatees who failed to
respond to the mail questionnaire. No significant differences were found
between responses of men who answered the final mail follow-up and
those who were interviewed. These instances of the failure of an employment
bias to appear are important because they are apparently
exceptions to the generalization frequently reached by others. They
suggest that the existence of such a bias will depend upon the attitudes 
of the members of the survey group toward reporting themselves un- 
employed to the originators of the survey—attitudes that may range 
from embarrassment and hesitancy to reply, on the one hand, to hope 
for assistance and positive motivation to reply on the other. 

The Problem of Bias in a “Homogeneous” Group. At least two recent
articles on the use of the mail survey have suggested that response
bias may be relatively unimportant, even when the proportion of
returns is quite low, if one is surveying a homogeneous group.¹³ No
definition of a homogeneous group is offered in either article but
three examples are cited: Time subscribers, people in Who’s Who, and
lawyers. It is stated that “results for returns from a small
proportion of such a group would tend to be more like the whole than
is the case when tastes or opinions within the group to be sampled are
very different,” and, “such biases as exist are perhaps of no great
importance.”¹⁴ This may
be true if one is surveying the group with respect to some subject of 
approximately equal interest (whether high interest or low interest) to 
most members. Ownership of various items, or reading or radio listen- 
ing habits, or voting plans may not differ significantly between re- 
spondents and non-respondents in a group relatively homogeneous 
with respect to education and occupation. 

But define a homogeneous group by any criteria other than the
characteristic which is to be estimated and there will still be
cleavages within the group. If one wishes to conduct a survey on a
subject of considerable interest to some members of a group and of
relatively little interest to others, then no matter how homogeneous
the group is with respect to education or income or reading habits,
the individuals most interested in the subject of the survey will
respond in higher proportion than those not interested and a serious
response bias may occur. An example is given by Seerly Reid, who
surveyed public school principals in Ohio (a group certainly as
homogeneous as lawyers or Time subscribers) on the use of radio in
their schools.¹⁵ Starting with 42 per cent replies to the initial
mailing and using successive follow-ups until 95 per cent of the cases
had been heard from, he found that the true proportion of schools
owning and using radio equipment was greatly overstated in early
returns. Even when 69 per cent of the sample group had replied, there
was a substantial bias in the basic estimate desired.

13 See Franzen and Lazarsfeld, op. cit., p. 293 and Benson, L. E., op. cit., p. 237.
14 Benson, op. cit., p. 237.
15 Reid, op. cit.
The crux of the problem of working with homogeneous groups is
this: How do you know they are homogeneous with respect to the 
relevant characteristics (including interests) until you have made a 
study of a representative segment of the group? 

Weighting for a Bias Within a Bias. Another fairly common fallacy 
is that when differential response occurs from various segments of the 
population as judged by certain known criteria, one need only weight 
the mail results according to the true distribution for the universe. Such 
weighting will, in general, help to reduce bias. If, for example, in a sur- 
vey of voting plans it is known that those who voted for a particular 
candidate in the previous election are overrepresented, one will obvi- 
ously gain in the accuracy of an estimate of voting intentions by proper 
weighting. But very often there are biases in response due to interest 
differentials within the various strata of the population that one wishes 
to have properly weighted. If, for example, Democrats are not only 
underrepresented, but the Democrats in the sample are not repre- 
sentative of all Democrats in the population surveyed, one may still be 
seriously in error after weighting to correct for the obvious bias in 
party affiliation. Whenever differential interest is a primary determinant 
of unequal response rates within the various segments, one may be 
lulled into a false sense of security by weighting which does not take 
such differential interest into account. Even where weighting helps to 
reduce one type of bias, it must not blind the worker to the fact that 
other sources of bias exist. 

An example of a “bias within a bias” that cannot be corrected simply 
by weighting to obtain proper representation from each stratum is af- 
forded by the survey of veterans’ attitudes toward National Service 
Life Insurance. As earlier reported, responses were heavily over- 
weighted from veterans who were maintaining their insurance in force. 
Insofar as one might be interested in presenting data on attitudes with 
respect to National Service Life Insurance, proper weighting of the 
replies of veterans keeping their insurance and those who had already 
lapsed their insurance was clearly called for. 

But such weighting would go only part way toward eliminating response
bias. For among veterans returning questionnaires who reported they
had not kept up their payments, there was a further bias: returns were
more often made by veterans interested in reinstating their insurance.
Thus among the 51 per cent who reported they had not kept up their
payments in answer to the initial mailing, 22 per cent expressed some
interest in reinstating. Among respondents to the special delivery
follow-up, however, only 15 per cent of those not up to date in their
payments expressed interest in reinstating. Among those who had still
not responded, it is likely that not more than 10 per cent of those
whose payments had not been kept up to date would be interested in
reinstating. Accordingly, an estimate of about 15 per cent seems most
likely as the true proportion of those interested in reinstating their
insurance.

TABLE 6

ESTIMATE OF PROPORTION OF VETERANS INTERESTED IN MAINTAINING
INSURANCE, USING WEIGHTING AND FOLLOW-UP DATA TO
REDUCE BIAS (PER CENT)

                                          Dropping Insurance
                                     ----------------------------    Total
                                             Desire to Reinstate     Keeping or
                          Keeping            -------------------     Desiring to
                          Insurance  Total   % of those  % of Total  Reinstate
                                             Dropping    Group
Unweighted returns to
  initial mailing            49       51        22          11          60
Weighted returns to
  initial mailing            23       77        22          17          40
Weighted returns using
  follow-up for reducing
  bias in proportion
  willing to reinstate       23       77        15          11          34





When the double bias is adjusted simultaneously, estimates of the 
proportion of men who are definitely interested in either maintaining or 
reinstating their insurance drop from 60 per cent, the unweighted fig- 
ure, to 34 per cent. The results are shown in Table 6 to illustrate the 
need for this double correction. (These figures are now probably out of 
date, owing to extensive changes in the laws governing veterans’ in- 
surance.) 
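The arithmetic behind Table 6 can be retraced with a short sketch. The function name and its arguments are introduced here for illustration; the inputs are the rounded percentages quoted in the text, so the results agree with the table only to within about one percentage point.

```python
def interested_share(keeping_pct, reinstate_rate_pct):
    """Per cent of all veterans keeping or wishing to reinstate insurance.

    keeping_pct: per cent of the group keeping insurance in force.
    reinstate_rate_pct: per cent of those dropping insurance who express
    interest in reinstating it.
    """
    dropping_pct = 100 - keeping_pct
    # Convert the rate among droppers into a per cent of the whole group.
    reinstating_pct = dropping_pct * reinstate_rate_pct / 100
    return keeping_pct + reinstating_pct

# Row 1: returns to the initial mailing, as received.
unweighted = interested_share(49, 22)
# Row 2: same reinstatement rate, reweighted to the 23 / 77 split.
weighted = interested_share(23, 22)
# Row 3: reweighted, with the follow-up-based rate of about 15 per cent.
double_corrected = interested_share(23, 15)
```

Weighting alone moves the estimate from roughly 60 to roughly 40 per cent; replacing the 22 per cent rate with the follow-up-based 15 per cent accounts for the remaining drop to about 34 per cent.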

In conclusion, it must be emphasized that while there are often close
resemblances in the types of bias encountered in different surveys,
each survey must be considered in a sense unique. Careful
consideration, in advance, of the probable extent of interest biases
and their effects on the distribution of other characteristics of the
respondent group may suggest the inclusion of control questions that
will be helpful in making informal adjustments of the sort here
reported.




































COORDINATING THE MEASUREMENTS OF 
RADIO LISTENING* 


HANS ZEISEL
McCann-Erickson, Inc. 


Even the expert is at times puzzled by the variety of yardsticks
which set out to measure the many aspects of radio listening:
there is the “coincidental” rating and “share of audience,” there are
“turn-over” figures and data on “total listening hours” and there are
“BMB” and other station or network “coverage” measurements.

This rather haphazard enumeration is meant to reflect on the not 
too systematic way in which these measurements have been devel- 
oped: they were built mostly around certain techniques of collecting 
the basic data rather than around a basic concept of what ought to be 
measured. The recall method, the coincidental telephone method, the 
automatic recorder, the diary, those are the techniques around which 
measurement systems have been created. 

Under these circumstances it seemed desirable to see whether these 
various measurements could not be linked together in some intelligible 
way and, hence, permit a more systematic coordination. 

The search for such a link was occasioned by a recent controversy
in the radio industry about the proper way to measure station
coverage; by that is meant the extent to which a radio station is
heard within a certain territory. One group wanted to define it as the
total listening volume (that is the product of the number of homes
listening, times the number of hours listened) actually devoted to one
station, as compared with other stations in that area. The other group
wanted to define coverage as the proportion of radio homes in an area
who listen at least once a week (the particular limit is, of course,
arbitrary) to any given station. Although this latter definition has
been accepted by the industry’s agency, the Broadcast Measurement
Bureau (BMB for short), the debate about the merits of this definition
has not quite abated. An investigation of this problem seemed to lead
to a general link that permits a systematic coordination of the
various radio measurements.

What is the very element of all radio listening? The act of listening
which connects an individual, or a home, with a radio station at any
given time. Suppose we consider the amount of listening done by 50
families during a time interval of 100 units, as measured along the
vertical axis of the array in Graph I.

* Paper given before the Annual Meeting of the American Statistical
Association, Atlantic City, 1947.

The abscissa lists the 50 families; the height of each bar indicates
the time this family was listening to the program or station.

From this graph several of the familiar indices of radio measurement
and one or two less familiar ones can be read off:

#1. The Total Audience: the proportion of families listening at all
    to that broadcast (50%).

#2. The Average Listening Time per Listening Home: the mean of
    all bars excluding the non-listeners, that is bars with height
    zero (66⅔% of the total time period under surveillance).

#3. The Average Listening Time for All Homes: the mean of all
    bars including those with height zero (33⅓% of the time period).

#4. The Audience Turn-Over: the ratio between measures 1 and 3
    (1.5).

#5. Certain measurements of Variability of listening, such as the
    average deviation of all bars with heights larger than zero.

From our graph one can easily perceive the connection between these
five measurements. One can also see more clearly what certain indices
do and do not indicate. Measurement #3, for instance, is tantamount
to the so-called “coincidental” measurement as produced by C. E.
Hooper.¹ Our graph makes it quite clear that this coincidental
measurement, for instance, can never tell you whether the average
which bears this title is achieved by the intensive listening of a few
homes or a spotty listening of many or all homes. One will realize the
implications of this statement: even if Hooper’s coincidental
measurement were accurate (which it is not²), it would never measure
the proportion of homes exposed to a broadcast.
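To make the connection between the five measurements concrete, here is a minimal sketch that derives all of them from one listening array. The function name, the dictionary keys and any data passed in are assumptions for illustration; the heights of the bars in Graph I are not reproduced here.

```python
def listening_indices(times, period=100):
    """Derive the five indices from one bar per family.

    times: listening time of each family, in the same units as period.
    Assumes at least one family listened (otherwise #2 is undefined).
    """
    n = len(times)
    listeners = [t for t in times if t > 0]
    mean_listener = sum(listeners) / len(listeners)

    total_audience = len(listeners) / n             # measure 1
    avg_per_listening_home = mean_listener / period  # measure 2
    avg_all_homes = sum(times) / n / period          # measure 3
    turnover = total_audience / avg_all_homes        # measure 4
    # Measure 5: average deviation among listening homes, in time units.
    avg_deviation = sum(abs(t - mean_listener) for t in listeners) / len(listeners)

    return {
        "total_audience": total_audience,
        "avg_per_listening_home": avg_per_listening_home,
        "avg_all_homes": avg_all_homes,
        "turnover": turnover,
        "avg_deviation": avg_deviation,
    }
```

Measure 3 is always the product of measures 1 and 2, which is why it cannot distinguish intensive listening by a few homes from spotty listening by many: with the figures quoted above, 50% × 66⅔% gives 33⅓%, and the turn-over of 1.5 is simply the reciprocal of measure 2.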
Our next step will give further proof of the usefulness of analyzing 
this type of listening pattern, because from the same type of listening 
array we can also determine the exact place of the two concepts of 
station coverage, mentioned previously: if we think of Graph I not as 
referring to one broadcast, but rather to a station or network, then our 
Measurement #1, the Total Audience, is nothing but the BMB-meas- 
urement of coverage. Measurement #3, on the other hand, represents 
the other, the volume-concept of measuring station coverage. 
After that, there can be little doubt as to the legitimacy of both 
coverage measurements. However, the controversy as to the relative 
merits of both measurements is still open. It can be decided, obviously, 
only on practical grounds, by determining which coverage concept 
supplies a better tool for solving our practical problems. Among these 
problems there is one which, because of its fundamental character, 
will provide a good test for the two coverage concepts. It is the ques- 
tion which lies at the bottom of all coverage problems: Which of two 
radio stations, broadcasting in a given territory, will deliver more lis- 
tening homes or, if one is also interested in the financial aspects of the 
operation, which one will deliver more listening homes per dollar? 
The following schematic data will show how BMB-data can be used 
for the solution of this problem: 
1. Territory A contains 150,000 radio families. 
2. BMB indicates that 80 per cent or 120,000 of these families listen 
at least once a week to Station X. 

3. BMB further indicates that 40 per cent or 60,000 families listen 
at least once a week to Station Y. 

4. One hour radio time on Station X costs $400, one hour on Station 
Y costs $150. 

On the basis of these data one will arrive at the simple conclusion
that Station X delivers 30,000 families per $100, whereas Station Y
delivers 40,000 families per $100. Hence, Station Y will appear as the
better buy.

1 Hooper, however, derives these 33⅓ per cent by a slightly different
method: he determines for every time unit the proportion of all
families who listen to the radio. The end result remains the same; he
will find that on the average for the entire time period, 33⅓ per cent
of all families listened to the radio.

2 Cf. Hans Zeisel, “The Coincidental Audience Measurement,”
Proceedings of the Sixteenth Institute for Education by Radio, Ohio
State University Press, Columbus, 1946, pp. 387-99.
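The comparison of Stations X and Y on BMB data alone reduces to one division per station. The sketch below retraces it; the function name and parameter names are introduced here for illustration.

```python
def families_per_100_dollars(radio_families, bmb_pct, hourly_cost):
    """Listening families delivered per $100 of one hour's radio time."""
    listening_families = radio_families * bmb_pct / 100
    return listening_families / (hourly_cost / 100)

# Territory A: 150,000 radio families.
station_x = families_per_100_dollars(150_000, 80, 400)  # BMB 80%, $400 an hour
station_y = families_per_100_dollars(150_000, 40, 150)  # BMB 40%, $150 an hour
```

Station Y comes out ahead on this measure only because every listening family is counted equally, however much or little it listens.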

Let us now recall the limitations of this measurement: we know
that the stated numbers of homes will actually listen to that station.
But what we do not know from BMB is whether one station may not be
listened to more frequently than the other. In other words, we know
only the number of families who will listen; we do not know how much
they will listen.

It is only fair to point out in this connection that from a theoretical 
point of view the coverage measurements for magazines and news- 
papers, that is the printed media, have to cope with exactly the same 
problem: the circulation of a magazine measures—granted certain 
simplifying assumptions—the number of homes reached by that maga- 
zine. To what extent a magazine is read, and whether one magazine is 
read more than the other, we do not know merely from the circulation 
figures, which are, thus, the logical correlate of the BMB figures. Just 
as for radio, there are for the printed media two dimensions of coverage: 
the Total Audience (circulation), and the amount of time spent by the 
individual family or person on reading the particular newspaper or 
magazine. 

We turn now to the second concept of radio coverage, the station- 
share-of-audience, as previously defined. To see fully the practical 
significance of this concept we might keep in mind that this station- 
share-of-audience is nothing but an average of program-shares for any 
defined period of time. 

Graph II, based on Philadelphia listening data as recorded by The 
Pulse, shows the development of the station-share from the program 
share concept, by ever increasing the time period involved. 

The station-share, then, will answer this important question for 
men who buy or sell radio time: if the program his agency produces is 
not worse or better than the average program, what share of the total 
listening in this area can he expect? One can easily see at this point 
already the desirability of having these data in addition to the BMB 
figures. 

Since their procurement on the basis of another nation-wide survey
would incur major expense, the research department of McCann-Erickson
set out to explore the possibility of developing the one coverage
concept from the other: the station-share from the BMB data. This can
be done if, and to the extent to which, the two measurements are
related to each other.

To see whether they are related, we availed ourselves of station
listening data from which both coverage measurements could be
computed. Through the courtesy of Audience Measurement Incorporated,
we received 1,000 complete family listening records for a full week’s
period. They were provided in the form of so-called listening diaries,
into which a home enters, for every quarter hour of an entire week,
the station, if any, to which it is listening.

GRAPH II

STATIONS’ SHARE OF AUDIENCE FOR VARYING TIME PERIODS
(Philadelphia, The Pulse, Jan.–Feb. 1946)

Five hundred of the diaries covered a block of 30 counties around
Oklahoma City, the other 500 covered a block of 18 counties around
Atlanta, Ga. In order to have a minimum of 35 diaries for one area,
some adjacent counties were combined so that we had altogether
10 distinct areas in the Oklahoma territory, and 11 such areas in
Georgia. For each of these 21 areas and for all stations in these areas
we were able to compute two measurements:

1. The BMB data, that is the proportion of families listening at
   least once a week to a station.

2. The Station-Share data, that is the proportion of the total
   listening time devoted to each station.

Actually we made two measurements of each kind: one for daytime 
listening and one for evening listening so as to parallel completely the 
official BMB measurements. 
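How both measurements fall out of the same diaries can be sketched as follows. The diary layout (one list of quarter-hour entries per family, with `None` for no listening) and the function name are assumptions made for this example, not a description of the actual diary forms.

```python
from collections import defaultdict

def coverage_measures(diaries):
    """Compute BMB % and Station Share % from listening diaries.

    diaries: dict mapping a family id to a list with one entry per
    quarter hour, each entry a station name or None (hypothetical layout).
    Returns (bmb, share), two dicts keyed by station.
    """
    n_families = len(diaries)
    families_reached = defaultdict(int)  # families listening at least once
    quarter_hours = defaultdict(int)     # listening volume per station

    for entries in diaries.values():
        heard = [s for s in entries if s is not None]
        for station in set(heard):
            families_reached[station] += 1
        for station in heard:
            quarter_hours[station] += 1

    total_volume = sum(quarter_hours.values())
    bmb = {s: 100 * c / n_families for s, c in families_reached.items()}
    share = (
        {s: 100 * q / total_volume for s, q in quarter_hours.items()}
        if total_volume
        else {}
    )
    return bmb, share
```

Two stations reaching the same number of families receive the same BMB percentage, while their shares differ whenever one of them is listened to for more quarter hours.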

Before presenting our conclusions, we would like to introduce a piece 
of evidence that will reaffirm our confidence in the technical soundness 
of the measurement operations involved. 

Graph III indicates the extent to which the coverage-measurements 
developed from our 1,000 diaries, according to the BMB-definition, 
correspond to the official BMB-percentages as developed for these 
stations and areas from the mail surveys conducted by the Broadcast 
Measurement Bureau. 

Each pair of bars represents the BMB measurement for one of the
six stations in that area: the black bar represents the official BMB
measurement, the white bar the BMB measurement developed
independently from our diaries. One does not need an exact measurement
of this comparison; mere inspection reveals an almost incredible
parallelism. In the lower right hand corner the other measurement, the
station-share for the six stations, is compared with an independently
secured station-share measurement; again the parallelism is striking.

Unless one suspects both methods of collecting the basic data of 
being subject to the same bias, the above comparison should greatly 
increase our confidence in the accuracy—though not in the usefulness— 
of the basic measurement methods involved. 

At long last we will now present the relationship found between the
two concepts of coverage measurement. Each point on the scatter
diagram shown in Graph IV represents one station measurement
(day or evening) for one of the 21 areas. Since we have 6 stations, in
21 areas, one day and one evening measurement, we have a total of
6 × 21 × 2, that is 252 observations.

The curves on this diagram are developed empirically through computed
middle values. These curves fitted better than the computed
exponential curves, for the use of which there is no theoretical
indication.

GRAPH IV

APPROXIMATE RELATIONSHIP BETWEEN BMB PERCENTAGE (ABSCISSA) AND
STATION SHARE PERCENTAGE (ORDINATE)


The indicated relationship varies somewhat for daytime and night- 
time coverage, but the main character of the curves is identical. The 
following general statements can be made: 

1. BMB percentages below 50 correspond, with rare exceptions, to
   Station Share percentages below 10 per cent, which is a small
   share indeed.

2. The curve continuously increases its slope. A change of 1 BMB
   percentage point represents different changes in Station Share
   percentages: the higher up in the scale the change occurs, the
   bigger a Station Share will 1 BMB percentage point represent.

3. The curve relating to BMB percentages below 50 is rather flat;
   the difference between BMB’s in the teens and in the thirties,
   expressed in Station Share percentage points, is negligible.

4. In the highest brackets of BMB, the corresponding differences in
   Station Share are rather great: any BMB percentage point
   above 80 represents more than one percentage point of Station
   Share.

The general implication of these curves is, that all comparisons based 
on BMB data will be unduly biased in favor of the station with the smaller 
BMB coverage, because of this basic relationship: the greater, in any 
given area, the percentage of families who listen to a station, the more 
time, on the average, will each individual family spend on listening to 
this station. The following table, based on our listening diaries, illus- 
trates the basic structure of this relationship: 








           Proportion of families     Average number of hours
           listening at least once    individual family listens
Station    a week to this station     per week to this
           (BMB%)                     station after 6 p.m.*

K                   10                        1}
L                   30                        2
M                   50                        23
N                   70                        32
O                   80                        43
P                   90                        53
Q                  100                        7%





* Average weekly listening time after 6 p.m. to all stations is (in this case) 124 hrs. 


It is this relationship which accounts for the curved character of the
scatter diagram in Graph IV, and therefore biases the BMB
measurements in favor of the smaller stations. Only if the listening
time per family were not related to the BMB percentage would the
diagram in Graph IV be represented through a straight line and, hence,
make BMB a correct index of station coverage.
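The reason for the curvature can be written out in symbols. The following is a sketch under stated assumptions, not a formula given in the study: let $N$ be the number of radio families in an area, $p_s$ the BMB proportion for station $s$, $h_s$ the average weekly hours that a family reached by $s$ spends with it, and $T$ the average weekly hours a family spends with all stations. All four symbols are introduced here for illustration.

```latex
\[
\text{Station Share}_s
  \;=\; \frac{N \, p_s \, h_s}{N \, T}
  \;=\; \frac{p_s \, h_s}{T}
\]
```

If $h_s$ were the same for every station, the share would be proportional to $p_s$ and the scatter would follow a straight line through the origin. Because $h_s$ rises with $p_s$, the share grows faster than $p_s$, which gives the curve its increasing slope.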

Finally, it may be remarked that the relationship presented in Graph
IV can be conceived as a special case of a more general statistical
relationship: that between the area of a frequency distribution above
a certain demarcation line, and the mean of that frequency
distribution. The area represents the proportion of families listening
more than no time to a given radio station, and the mean represents
the average length of time a family listens to that station; this time
period is expressed, in our example, as share of the average total
time spent by one family on listening to all stations.

In measuring the coverage of printed media, the same problem 
exists: what is called “readership” can be conceived as the area repre- 
senting the individuals who read a certain minimum (defined in time 
or printed space); and the mean amount of reading would correspond 
to our second coverage concept, the mean amount of listening. 

For purposes of actual translation of BMB percentages into Station
Share percentages the following tabular form of the curve from Graph
IV will be helpful.


Translation Table

BMB Percentage      Station Share Percentage
                      Day        Evening
      10               1            1
      20               2            3
      30               4            5
      40               6            8
      45               7           10
      50               8           11
      55              11           13
      60              13           16
      65              16           18
      70              19           21
      75              23           25
      80              28           30
      82              30           32
      84              32           34
      86              34           37
      88              37           39
      90              40           42
      92              44           45
      94              47           49
      96              50           52
      98              54           55
     100              58           59


Before discussing the practical significance of this translation table, 
a theoretical remark seems in order: we well know that this table 
is based on observations in only two independent territories. It is 
true that the two territories yielded almost identical relationships; it is 
for this reason that we built our table on the combined results from 
both tables. However, we are well aware of the fact, that different 
patterns of station competition will yield different translation tables. 
We venture, however, the guess that the deviations from our table 
will not be exorbitant. 
What, now, is the practical significance of this translation table?
It will become clear if we return, for a moment, to the practical prob- 
lem we outlined above in schematic form: we found 120,000 or 80 per 
cent of the families listening to Station X, and 60,000 or 40 per cent 
listening to Station Y. Underlying our decision that Station Y consti- 
tuted the better buy was the seemingly correct assumption that 120,000 
families represent twice as much listening as 60,000 families. Our trans- 
lation table enables us to refine this assumption by basing our decision 
not only on the number of listening families, but also on the time they 
will spend listening. From our table we find that a BMB of 80 per 
cent indicates a 30 per cent share of the total (evening) listening vol- 
ume, whereas a BMB of 40 per cent indicates not half as much, but 
only one fourth the listening volume, that is 8 per cent of the total 
radio listening time in that area. This illustrates the point made previ- 
ously, that the BMB measurement makes the smaller stations seem 
relatively more important than they actually are. Since Station X de- 
livers almost four times as much listening as Station Y, but costs less 
than three times as much ($400 against $150), we have to reverse our 
decision and conclude that Station X is the more profitable purchase. 
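The reversal can be retraced with the evening column of the translation table. The sketch below is illustrative only: the function names are introduced here, linear interpolation between tabulated rows is an assumption of this example rather than part of the study, and the first evening entry (BMB 10) is taken as 1.

```python
from bisect import bisect_left

# Evening column of the translation table (BMB % -> Station Share %).
BMB = [10, 20, 30, 40, 45, 50, 55, 60, 65, 70, 75, 80,
       82, 84, 86, 88, 90, 92, 94, 96, 98, 100]
EVENING = [1, 3, 5, 8, 10, 11, 13, 16, 18, 21, 25, 30,
           32, 34, 37, 39, 42, 45, 49, 52, 55, 59]

def evening_share(bmb_pct):
    """Station Share % for a BMB %, interpolating linearly between rows."""
    if not BMB[0] <= bmb_pct <= BMB[-1]:
        raise ValueError("BMB percentage outside the tabulated range")
    i = bisect_left(BMB, bmb_pct)
    if BMB[i] == bmb_pct:
        return float(EVENING[i])
    x0, x1 = BMB[i - 1], BMB[i]
    y0, y1 = EVENING[i - 1], EVENING[i]
    return y0 + (y1 - y0) * (bmb_pct - x0) / (x1 - x0)

def share_per_100_dollars(bmb_pct, hourly_cost):
    """Station Share percentage points bought per $100 of radio time."""
    return evening_share(bmb_pct) / (hourly_cost / 100)

station_x = share_per_100_dollars(80, 400)  # BMB 80%, $400 an hour
station_y = share_per_100_dollars(40, 150)  # BMB 40%, $150 an hour
```

Per $100, Station X buys a larger share of the evening listening volume than Station Y, the opposite of the ranking obtained from families per $100.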

In our example, knowledge of the Station Share measurement re- 
versed the decision made only on the basis of BMB data; that need 
not be always the case. However, a decision based on Station Share 
data will always be closer to the true facts, because it considers both 
dimensions of listening—the number of families as well as the time 
each family devotes to radio listening. BMB considers only the first 
of these dimensions. 

The purpose of this little study was twofold: it was meant to be a 
contribution to the methods of measuring radio coverage, and we hope 
that our effort will be followed by many more of this kind, establishing 
different translation tables for different patterns of station competi- 
tion. 

But this study was meant to serve also as an example of what can be 
generally accomplished by studying patterns of listening distributions. 
It seems about time that we stop matching seemingly independent 
radio measurements against each other, measurements which turn out 
to be organically related, if the basic listening distributions are studied. 
To concentrate future research operations, not so much on the various 
individual measurements, but rather on the listening distributions from 
which they derive, might prove a promising enterprise: not only will 
we gain a better understanding of the things we now know, but this 
path may well lead us also to new fields of listening behavior. 
A STRATIFIED-RANDOM SAMPLE OF A SMALL
FINITE POPULATION


F. G. CORNELL
U. S. Senate Committee on Labor and Public Welfare* 


This is the description of the sample plan for a survey of 
enrollments by mail questionnaire of 1800 colleges and uni- 
versities in the United States. 

The requirements of the survey included speed, accuracy,
and advance knowledge of reliability, with decided limitations
as to available staff time. Previous surveys on enrollments
provided information on which to design a stratified-random
sample, making use of the principle of optimum allocation.


Though there are fewer than 1800 colleges and universities in the
United States, problems of non-response and inaccuracies in reporting
prevent quick and accurate surveys by complete coverage. Sampling
had not been undertaken on the grounds that there were such small
numbers of institutions in various categories that sampling would not
pay.

The results of a survey conducted by the U. S. Office of Education
in the fall of 1946¹ proved this position to be untenable. In the early
and summer months of 1946, a greatly accelerated demobilization of 
military personnel had increased applications for admission to college. 
Administrative decisions in several nation-wide college and university 
programs depended upon knowing with reasonable accuracy as soon 
after fall opening as possible the extent of expansion of enrollments in 
higher educational institutions. Sampling was feasible because lists 
of elements (institutions) were available, as were enrollment statistics 
for practically all institutions for previous years. The latter permitted 
gains in efficiency of design through stratification. Sampling was neces- 
sary because limited resources were available, and results were desired 
as quickly as possible. 

The purpose of the survey was to secure unbiased estimates of total 
enrollments of various types of students in each of six major classes of 
higher educational institution. Estimates of total enrollment of all 
students were to have coefficients of variation of .05 for each of the six 
classes. 


* Now Director of Research and Service, College of Education, University of Illinois. 
1 For report of survey see F. G. Cornell, “Higher Education in the Fall of 1946,” Journal of the 
American Association of Collegiate Registrars, 22: 147-58, January 1947. 
TYPE OF ESTIMATES USED

A modified stratified-random design was used.² Institutions listed in
the Office of Education files or the latest directory defined the
population. For most of the 1,749 institutions filed, there were also
1941-42 and 1943 fall enrollments from biennial surveys of higher
education for those years.

Estimates for each class of institution were computed as follows:

$$S = \sum_{i=1}^{L} \frac{N_i}{n_i} \sum_{j=1}^{n_i} x_{ij} \tag{1}$$

where $S$ is the estimated total enrollment of all institutions in all
$L$ strata of a particular class of institution, $N_i$ is the total
number of institutions in the $i$th stratum, $n_i$ the number in the
sample from the $i$th stratum, and $x_{ij}$ is the enrollment of the
$j$th institution in the $i$th stratum.

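As a small illustration of estimate (1), the sketch below expands each stratum's sample total by the ratio of institutions in the stratum to institutions sampled. The function name and the input layout are assumptions for this example, and any figures passed in are hypothetical.

```python
def estimated_total(strata):
    """Estimate (1): sum over strata of (N_i / n_i) * sum of sampled x_ij.

    strata: list of (N_i, sample) pairs, where N_i is the number of
    institutions in the stratum and sample is the list of enrollments
    reported by the n_i sampled institutions (n_i >= 1).
    """
    return sum(N_i / len(sample) * sum(sample) for N_i, sample in strata)
```

For instance, a stratum of 13 institutions with sampled enrollments of 900 and 1,100 contributes 13 / 2 × 2,000 = 13,000 to the estimated total.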
STRATIFICATION 


The first step in designing the sample was grouping the 1,749 insti- 
tutions into eight type categories. The six categories of institution to 
be used in the final report were the basis for this classification. For sur- 
vey purposes, however, universities and junior colleges were split into 
two groups, each on the basis of source of control, public or private, 
thus producing the eight type categories as follows: 





                                                     Number of institutions
Type of institution                                  Total (N)   In sample (n)
a. Publicly controlled universities and large
   institutions of complex organization                  69           23
b. Privately controlled universities and large
   institutions of complex organization                  62           20
c. Colleges of arts and sciences                        557           53
d. Independent technical and professional schools       287           72
e. Teachers colleges and normal schools                 201           57
f. Publicly controlled junior colleges                  246           30
g. Privately controlled junior colleges                 222           47
h. [illegible]                                          105           31
   Total                                              1,749          333


Each of the eight type categories was divided into five size strata,
having approximately equal 1941-42 enrollment. For each type category
in which there were institutions enrolling women only or men only, two
additional separate strata were established for such institutions. In
categories in which there were institutions from which no 1941-42
statistics had been received, there was an eighth stratum for such
institutions.

2 See theory in reference [2] by Hansen and Hurwitz, and report of
application in reference [1] by Deming and Simmons. Dr. Deming and
Mr. Simmons were both of material assistance in the planning and
execution of the survey reported here.

For example, the 201 teachers colleges and normal schools were 
stratified as follows: 

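                                   Number of
Group                              institutions
e-1 ..............................     13
e-2 ..............................     18
e-3 ..............................     26
e-4 ..............................     42
e-5 ..............................     73
e-6 ..............................     24
e-8 (no enrollment data) .........      5

    Total ........................    201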


The first five strata are based on size, the 13 largest institutions 
(1941-42 enrollment) in group e-1, the 73 smallest in group e-5, etc. 
Group e-6 is made up of the 24 institutions which have no men students 
enrolled in either 1941-42 or the fall of 1943. The last stratum con- 
sisted of all institutions for which useable statistics were not available 
for classification otherwise. 


DETERMINING THE NUMBER OF CASES REQUIRED 
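Where

$$\sigma_i^2 = \frac{1}{N_i} \sum_{j=1}^{N_i} (x_{ij} - \bar{x}_i)^2$$

is the variance of enrollments of institutions in the $i$th stratum for
any date, the variance of the estimate in (1) is

$$\sigma_{\hat{S}}^2 = \sum_{i=1}^{L} \frac{N_i^2 \sigma_i^2}{n_i} \cdot \frac{N_i - n_i}{N_i - 1} \qquad (2)$$

$$\approx \sum_{i=1}^{L} \frac{N_i^2 \sigma_i^2}{n_i} - \sum_{i=1}^{L} N_i \sigma_i^2. \qquad (3)$$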


It is important to note that these formulas require the standard
deviation, or variance, of the population, not that obtained from a
sample. In the present case almost complete knowledge of population
variances was at hand; where not, it was estimated. Actually the
analysis of 1941-42 and 1943 data was useful only as a means of
approximating characteristics of the desired population, namely,
enrollment of institutions in 1946. This assumed high correlations
between 1946 enrollments and enrollments for the earlier years. As will
be seen, such approximations were useful in this survey in reducing the
number of cases required to meet the desired reliability.
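The two variance formulas can be compared numerically. The following is a minimal sketch, assuming that formula (3) is what formula (2) becomes when N_i − 1 is replaced by N_i; the function names and the stratum figures are hypothetical and serve only to show that the two forms agree closely when strata are not very small.

```python
def variance_of_estimate(strata):
    """Formula (2). `strata` is a list of (N_i, n_i, sigma2_i) triples,
    where sigma2_i is the population variance within the stratum."""
    return sum(N**2 * s2 / n * (N - n) / (N - 1) for N, n, s2 in strata)


def variance_of_estimate_approx(strata):
    """Formula (3): the same quantity with N_i - 1 replaced by N_i."""
    first = sum(N**2 * s2 / n for N, n, s2 in strata)
    second = sum(N * s2 for N, n, s2 in strata)
    return first - second


# Hypothetical strata: (N_i, n_i, sigma_i squared).
hypothetical = [(10, 4, 25.0), (200, 20, 400.0)]
exact = variance_of_estimate(hypothetical)
approx = variance_of_estimate_approx(hypothetical)
print(exact, approx)
```

Both forms vanish when every stratum is completely enumerated (n_i = N_i), which is the finite population behavior the formulas are meant to capture.
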

According to the principle of optimum allocation,³ the fewest cases
will be required to yield a given level of reliability if they are
distributed among strata in proportion to $N_i\sigma_i$. That is,

$$\frac{n_i}{n} = \frac{N_i\sigma_i}{\sum_{i=1}^{L} N_i\sigma_i} \qquad (4)$$

or

$$n_i = \frac{N_i\sigma_i}{\sum_{i=1}^{L} N_i\sigma_i}\, n \qquad (5)$$

where $n$ is the number of institutions in the sample from all strata in
the class, i.e.,

$$n = \sum_{i=1}^{L} n_i.$$


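The formula for determining $n$ is easily derived from (3) and (5).
Substituting (5) in (3) gives
$\sigma_{\hat{S}}^2 = \left(\sum N_i\sigma_i\right)^2 / n - \sum N_i\sigma_i^2$,
so that

$$n = \frac{\left(\sum_{i=1}^{L} N_i\sigma_i\right)^2}{\sigma_{\hat{S}}^2 + \sum_{i=1}^{L} N_i\sigma_i^2}. \qquad (6)$$

In this survey the standard error was not set. Instead the coefficient of
variation was set as follows:

$$\text{C.V.} = \sigma_{\hat{S}}/S = .05.$$

Hence (6) above was applied in the following form:

$$n = \frac{\left(\sum_{i=1}^{L} N_i\sigma_i\right)^2}{.0025\,S^2 + \sum_{i=1}^{L} N_i\sigma_i^2}. \qquad (7)$$
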
3 Originally advanced by Neyman [3].

Though the stratification was based upon 1941-42 enrollment, the
allocation using equations (5) and (7) was based upon enrollments in
the fall of 1943. The reason for this was to make allowances for changes
in variances within strata from year to year. By the method of
stratification, strata were very homogeneous as to 1941-42 enrollment.
Slightly higher relative variances would be expected in the fall of
1946 to the extent that 1941-42 and 1946 enrollments did not correlate
perfectly. It was assumed that the magnitude of relative variances of
1943 data would be more like those for 1946 data for institutions
grouped into 1941-42 size strata. This assumption proved to be correct.

In Table 1 are standard deviations and coefficients of variation for
the five size strata of teachers colleges and normal schools. Since
they were stratified according to 1941-42 enrollment, the coefficients
are low for that year. The impact of wartime reductions in enrollment
was variable. As may be expected, therefore, the coefficients for the
fall of 1943 were higher, and in magnitude more like those actually
found for the survey year 1946, even though the prewar return to large
enrollments was made with much more uniformity within some strata,
particularly e-2 and e-3, than in others, particularly e-1. The standard
deviation of the largest stratum, e-1, tripled between 1943 and 1946
as enrollments increased by 150 per cent. In contrast, in stratum e-2
enrollments were three times as great in 1946 as in 1943, but the
standard deviation increased only from 190 to 251.

TABLE 1
COMPARISON OF STRATA VARIANCES IN ENROLLMENTS, 1941-42,
FALL 1943, AND ESTIMATED FOR FALL 1946
Group E—Teachers Colleges and Normal Schools

                 1941-42              Fall 1943              Fall 1946
                                                      (estimated from sample)
Stratum*    σ_x     x̄     C.V.    σ_x     x̄     C.V.    σ_x      x̄     C.V.

e-1         505   1,856   .272    325    865    .376   1,271   2,200   .578
e-2         109   1,278   .085    190    515    .369     251   1,638   .153
e-3         114     866   .132    189    387    .488     278     992   .280
e-4          86     553   .156     82    251    .327     243     752   .323
e-5         130     255   .510     86    158    .544     177     399   .444

* Does not include other strata for which variances were not computed for all three periods.

An examination of the sets of coefficients of variation for the three
periods in Table 1 will show how greatly the 1946 situation would have
been underestimated using 1941-42 variances, and that this was in
part corrected by using variances of the 1943 enrollment data.

ALLOCATION PROCEDURE ILLUSTRATED


The allocation procedure will be illustrated, again using the 201
teachers colleges and normal schools. The 1943 fall total enrollment
for the 201 institutions, $S$, was 56,472. In Table 2 are other data needed
to solve equation (7).

$$n = \frac{715{,}830{,}025}{.0025 \times 56{,}472^2 + 4{,}638{,}948} = \frac{715{,}830{,}025}{7{,}974{,}976 + 4{,}638{,}948} = 56.7.$$

This number is distributed according to (5) by means of the $N_i\sigma_i$
values of strata (column 5 of Table 2). The result, after rounding to
whole numbers, appears in column 7 of Table 2.


TABLE 2
ANALYSIS OF 1944 ENROLLMENT STATISTICS ON BASIS OF WHICH ALLOCATION
OF CASES WAS MADE AMONG SIX STRATA OF TEACHERS
COLLEGES AND NORMAL SCHOOLS

Stratum*    N_i     x̄_i     σ_i     N_iσ_i     N_iσ_i²     n_i
  (1)       (2)     (3)     (4)      (5)         (6)       (7)

e-1          13     865     325     4,225    1,373,268       9
e-2          18     515     190     3,420      649,854       7
e-3          26     387     189     4,914      930,072      10
e-4          42     252      82     3,444      285,138       7
e-5          72     159      86     6,192      533,520      13
e-6          24     163     190     4,560      867,096      10

Total         —       —       —    26,755    4,638,948      56†

* Stratum e-6 consists of institutions with no men enrolled in 1944. Strata e-1 to e-5 inclusive
were classified by size according to 1941-42 enrollments. This table does not include stratum e-8
consisting of five institutions in this class for which there were no enrollment data for 1942 or
1944. In all such strata a systematic sample of one in five was drawn.

† Does not include one institution sampled from the 5 institutions in stratum e-8.
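The worked allocation can be retraced with a short sketch of equations (7) and (5). The function names are illustrative only; the inputs are column (5) of Table 2, the printed total of column (6), and the 1943 fall enrollment of 56,472. Because the text rounds the standard error to 2,824 before squaring, the computed n differs from the printed 56.7 in the second decimal place.

```python
def required_sample_size(sum_n_sigma, sum_n_sigma2, total, cv):
    """Equation (7): n = (sum N_i*sigma_i)^2 / ((cv*S)^2 + sum N_i*sigma_i^2)."""
    return sum_n_sigma**2 / ((cv * total) ** 2 + sum_n_sigma2)


def allocate(n, n_sigma):
    """Equation (5): distribute n among strata in proportion to N_i*sigma_i."""
    denom = sum(n_sigma)
    return [n * w / denom for w in n_sigma]


# Column (5) of Table 2 (N_i * sigma_i for strata e-1 to e-6).
n_sigma = [4225, 3420, 4914, 3444, 6192, 4560]

# 4,638,948 is the printed total of column (6); 56,472 is S; .05 is the C.V.
n = required_sample_size(sum(n_sigma), 4_638_948, 56_472, 0.05)
per_stratum = [round(x) for x in allocate(n, n_sigma)]
print(round(n, 2), per_stratum, sum(per_stratum))
```

Rounding each stratum's share to a whole number reproduces column (7) of Table 2.
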


EFFICIENCY GAINED BY STRATIFICATION WITHIN THE SIX TYPE 
CATEGORIES 


At this point it is of significance to note the efficiency which results
in estimating enrollment of one of the six type categories from the
stratified-random design with optimum allocation as described above.
The usual means of determining the efficiency of a sample is a
comparison with an unrestricted random sample. If there had been no
stratification, that is to say, if all of the teachers colleges had been
lumped together and a random sample drawn from among them with the same
type of estimate as in equation (1), all teachers colleges would in
effect become one single stratum and the variance of the estimated total
enrollment would be:

$$\sigma_{\hat{S}}^2 = \frac{N^2\sigma_x^2}{n}\left(\frac{N - n}{N - 1}\right). \qquad (8)$$

The variance for all 184 teachers colleges, $\sigma_x^2$, was 62,153. For
this particular group of 184 teachers colleges for which there were
data, the variance using equation (8) would be approximately
26,000,000. The stratified-random design was geared to a standard error
of 2,824, or a variance of about 8,000,000. The stratified plan used in
the survey was, therefore, over three times as efficient as the
unrestricted random design would have been. The number of cases required
on a random basis to produce the standard error of 2,824, which was the
goal of the survey, would be so large that practically all of the 184
cases, or a complete count, would be required. In other words, simple
“random” sampling would not be feasible in the type of finite universe
with which we are dealing. Sampling in a small finite population is
feasible and worthwhile, if previous knowledge is available on the
population, and if it can be properly utilized in sample design.
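The comparison with unrestricted random sampling can be checked with a few lines. This is only a sketch: the text does not state which sample size was used in equation (8), so n = 56 (the size reached by the stratified allocation) is assumed here, and the function name is illustrative.

```python
def unrestricted_variance(N, n, sigma2):
    """Equation (8): variance of the estimated total from an unrestricted
    random sample of n out of N institutions with population variance sigma2."""
    return N**2 * sigma2 / n * (N - n) / (N - 1)


N = 184           # teachers colleges with data
sigma2 = 62_153   # variance of enrollment over all 184
n = 56            # ASSUMPTION: same size as the stratified sample

v_random = unrestricted_variance(N, n, sigma2)
v_stratified = 2_824**2  # design standard error of 2,824, squared
relative_efficiency = v_random / v_stratified
print(round(v_random), round(relative_efficiency, 2))
```

Under that assumption the unrestricted variance comes out near 26,000,000 and the ratio a little over three, in line with the figures quoted above.
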

To further increase sampling reliability for each type category, addi- 
tional stratification according to size and the increase or decrease in 
civilian population 1940-44 in the State in which the institution is 
located was introduced. This substratification was carried to the point 
where each sample case represented a distinct group of schools. Each 
school within each stratum was assured equal chance of selection 
through the use of random numbers. For this design the reliability 
of totals for all higher educational institutions is better than that indi- 
cated by the anticipated .05 relative error. 

Substratification by size was found useful in strata c-6 (colleges of 
arts and sciences—women only), d-7 (technical and professional 
schools—men only), and g-6 (privately controlled junior colleges— 
women only). In each case there were substantial numbers of institu- 
tions heterogeneous as to enrollment. In each case five substrata of 
approximately equal enrollment were established according to 1941- 
42 enrollment. This further step reduced the number of cases required, 
n, according to equation (7) as follows:

c-6 ..............  106 to 51
d-7 ..............  114 to 67
g-6 ..............   92 to 39

CALCULATING ESTIMATED SAMPLING ERRORS FROM THE SAMPLE ITSELF


The above procedures in designing the sample involved approximations
of variances expected in 1946. It was, therefore, desired (a)
to check the effectiveness with which these approximations served in
the efficiency of the sample plan, and (b) to inform those using
enrollment estimates of their reliability.

The sample was designed for estimating total enrollment. Time, re- 
sources, and detail of previous data, did not permit determining in ad- 
vance what reliability the design should produce with enrollment com- 
ponents, such as men, women, veterans, etc. Since there is not perfect 
correlation between each of these and total enrollment, the coefficients 
of variation as designed were expected to be greater than .05. It 
was important to determine by how much before reporting results of 
the survey.⁴

The variances used in designing the sample and allocating cases 
were based upon complete or nearly complete enumeration of the 
various group universes. Estimating variances actually obtained from 
the sample required the usual modifications in formulae. 

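Where $s_i^2$ is the sample variance, i.e.,

$$s_i^2 = \frac{\sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2}{n_i}$$

in the $i$th stratum, an unbiased estimate of the population variance,
$\hat{\sigma}_i^2$, may be written

$$\hat{\sigma}_i^2 = \left(\frac{N_i - 1}{N_i}\right)\left(\frac{n_i}{n_i - 1}\right) s_i^2. \qquad (9)$$

An unbiased estimate of the variance of estimated enrollment,

$$\frac{N_i}{n_i} \sum_{j=1}^{n_i} x_{ij},$$

for the $i$th stratum would be

$$\hat{\sigma}_{S_i}^2 = \frac{N_i^2 \hat{\sigma}_i^2}{n_i}\left(\frac{N_i - n_i}{N_i - 1}\right). \qquad (10)$$
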

4 A plan was anticipated of using ratio estimates if this were found necessary for some of the special
groups of enrollments. These would have been slightly biased, but would have been expected to yield
lower sampling variances. See reference [3]. The reliabilities of estimates as outlined in the foregoing,
however, were found to be acceptable.

From (9) this may be written

$$\hat{\sigma}_{S_i}^2 = N_i\left(\frac{N_i - n_i}{n_i - 1}\right) s_i^2. \qquad (11)$$

For convenience in computation the following form was actually used:

$$\hat{\sigma}_{S_i}^2 = \frac{N_i (N_i - n_i)}{n_i (n_i - 1)} \left\{ \sum_{j=1}^{n_i} x_{ij}^2 - \Big(\sum_{j=1}^{n_i} x_{ij}\Big)^2 \Big/ n_i \right\}. \qquad (12)$$

The variances of estimates over all strata in a group or a combination
of groups of strata were a summation of the variances in (12) over all
such strata. That is, the estimated variance $\hat{\sigma}_{\hat{S}}^2$ of our
estimate for enrollment in a given type of institution of $L$ strata was,
from (11):

$$\hat{\sigma}_{\hat{S}}^2 = \sum_{i=1}^{L} N_i\left(\frac{N_i - n_i}{n_i - 1}\right) s_i^2. \qquad (13)$$



In a few cases where only one institution was sampled in a stratum, 
an estimate was made assuming proportionality with adjacent strata 
on the basis of variances computed for 1941-42 and fall 1943. 
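The computing form (12) and its summation over strata can be sketched as follows. The function names and the enrollment figures are hypothetical; the sketch also assumes at least two sampled institutions per stratum, since single-institution strata were handled by the separate approximation just described.

```python
import math


def stratum_variance(N_i, sample):
    """Formula (12): estimated variance of the expanded stratum total,
    computed from the sum and the sum of squares of the sampled enrollments."""
    n_i = len(sample)
    if n_i < 2:
        raise ValueError("formula (12) needs at least two sampled institutions")
    sum_x = sum(sample)
    sum_x2 = sum(x * x for x in sample)
    return N_i * (N_i - n_i) / (n_i * (n_i - 1)) * (sum_x2 - sum_x**2 / n_i)


def total_variance(strata):
    """Sum of the stratum variances over all strata of a class."""
    return sum(stratum_variance(N_i, sample) for N_i, sample in strata)


# Hypothetical strata: (N_i, sampled enrollments).
example_strata = [
    (4, [1000, 1200]),
    (10, [300, 340, 260, 300, 300]),
]
estimated_variance = total_variance(example_strata)
standard_error = math.sqrt(estimated_variance)
print(estimated_variance, round(standard_error, 1))
```

Working from sums and sums of squares avoids computing each stratum mean first, which is presumably why this form was the convenient one for hand computation.
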

A summary of enrollment estimates and standard errors appears in
Table 3.

SAMPLING METHODS APPLIED TO ESTIMATING
NUMBERS OF COMMERCIAL ORCHARDS IN A
COMMERCIAL PEACH AREA


Francis E. McVay* 
University of North Carolina 


This paper presents the results of a study with two main 
objectives: (1) to investigate an extension of some principles 
recently developed in the theory of cluster sampling, and (2) 
to determine the efficiency of sample segments containing 
specified numbers of farms, as indicated by highway maps, 
for estimating numbers of commercial orchards in a commer- 
cial peach area. 


MATERIAL AND METHODS 


Data for the study were obtained from a complete enumeration of all
peach orchards of 200 or more trees, made during June, 1946, in the
Sandhills commercial peach area of North Carolina, in connection with
the crop reporting work of the Agricultural Estimates Branch, Bureau
of Agricultural Economics, and the North Carolina Cooperative Crop
Reporting Service. Two hundred and fifty-seven commercial orchards,
defined as those with 200 or more trees, were found in the area. County
highway maps indicated a total of 8550 farms. About 3 percent of all
farms in the area thus represented commercial peach growers.

To avoid, insofar as possible, the difficulties associated with the ex- 
treme clustering of orchards, the highway maps of the area were di- 
vided into several sets of sample segments, containing an average of 
5, 10, 15, 20, 25, 30, and 50 farms per segment respectively. The head- 
quarters of farms upon which commercial peach orchards were located 
were then marked on the maps. The analysis for each set of sample 
segments, including one-farm segments, was based on an analysis of 
variance suggested by W. A. Hendricks of the Bureau of Agricultural 
Economics. The fundamental equation from which the analysis of vari- 
ance was derived is given by the algebraic identity: 


$$nk\bar{p}(1 - \bar{p}) = k\,S\,p_i(1 - p_i) + k\,S\,(p_i - \bar{p})^2 \qquad (1)$$

where S denotes summation over the n sample segments, and

n = number of sample segments
k = farms per sample segment
p_i = fraction of farms in respective sample segments which contained
      commercial peach orchards, where i = 1, 2, 3, ..., n
p̄ = S(p_i)/n = fraction of farms in the area which contained commercial
      peach orchards.

* Agricultural Statistician, Bureau of Agricultural Economics in Cooperation with the Institute
of Statistics of the University of North Carolina, and Assistant Agricultural Economist, North Carolina
State College.



The symbolic analysis of variance based on this equation is given be- 
low, together with a numerical analysis of the peach orchard data, using 
the 285 segments of 30 farms each: 


TABLE 1
ANALYSIS OF VARIANCE OF COMMERCIAL PEACH ORCHARDS
IN NORTH CAROLINA

Symbolic

Variation           d.f.      S.S.               M.S.
Between segments    n−1       kS(p_i − p̄)²       kS(p_i − p̄)²/(n−1)
Within segments     n(k−1)    kSp_i(1 − p_i)     kSp_i(1 − p_i)/[n(k−1)]
Total               nk−1      nkp̄(1 − p̄)         nkp̄(1 − p̄)/(nk−1)

Numerical

Variation           d.f.      S.S.               M.S.
Between segments     284       68.8788           .24253
Within segments     8265      178.3188           .021575
Total               8549      247.1976           .028915
| 





An integral part of the analysis is the assumption that within-segment
variances bear a linear logarithmic relationship to segment size,¹
of the form $\sigma^2 = ak^b$. The numerical values of a and b can be computed
from the mean square within segments of 30 farms and the total mean
square, the latter being the mean square within the entire finite
population of 8550 farms. (See Table 1.) Solving the simultaneous
equations,

$$\log a + b \log 8550 = \log 0.028915$$
$$\log a + b \log 30 = \log 0.021575$$

gives the values: a = 0.01809, b = 0.05181. When the within-segment
variance is represented by $\sigma^2 = ak^b$, the estimated variance of the
segment averages for sample segments of k farms each in a population of
N farms, is given by the equation:

$$\sigma_x^2 = \frac{aN}{N - k}\left[(N - 1)N^{b-1} - (k - 1)k^{b-1}\right] \qquad (2)$$



1 Based on studies by Jessen and Hendricks, reported in (a) Jessen, Raymond J., “Statistical
Investigation of a Sample Survey for Obtaining Farm Facts,” Iowa Ag. Expt. Sta., Res. Bull. 304, 1942,
(b) Hendricks, W. A., “Relative Efficiencies of Groups of Farms as Sampling Units,” Journal of the
American Statistical Association, Vol. XXXIX, pp. 266-276.

Solving this equation for a wide range of segment sizes gives the values
shown in the first column of Table 2.
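The logarithmic method can be retraced as a short sketch: the two mean squares of Table 1 fix a and b, and equation (2) then gives the variance of segment averages for any segment size. The function names are illustrative; the inputs are the printed mean squares and N = 8550.

```python
import math


def fit_log_relation(ms_within, k, ms_total, N):
    """Solve log a + b log N = log ms_total and log a + b log k = log ms_within."""
    b = math.log(ms_total / ms_within) / math.log(N / k)
    a = ms_within / k**b
    return a, b


def segment_average_variance(a, b, k, N):
    """Equation (2): variance of segment averages for segments of k farms."""
    return a * N / (N - k) * ((N - 1) * N ** (b - 1) - (k - 1) * k ** (b - 1))


N = 8550
a, b = fit_log_relation(0.021575, 30, 0.028915, N)
variances = {k: segment_average_variance(a, b, k, N)
             for k in (1, 5, 10, 15, 20, 25, 30, 50)}
for k, v in variances.items():
    print(k, round(v, 5))
```

For k = 1 the formula reduces to the total mean square, and for k = 30 it returns the observed between-segment mean square divided by 30, so the fitted curve passes through both points used to determine a and b.
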

The next step is to compare the results obtained from this solution
with the results obtained from an analysis based on variance components.
The latter method involves the assumption that within-segment
variances are constant for different segment sizes. Computations on the
peach data show that they do not remain exactly constant, but,
especially when segment size is small, tend to increase as segment size
is increased, as shown in Table 3. As segment size increases beyond 15
farms, the within-segment variance remains nearly constant.

TABLE 2
VARIANCES OF SAMPLE SEGMENT AVERAGES COMPUTED BY TWO METHODS,
AS COMPARED WITH THE OBSERVED VARIANCES

                       Variances of Sample Segment Averages

Number of        Computed                                             Observed
Farms in         By Method 1, Assuming     By Method 2, Assuming
Sample           Logarithmic Relation of   Constant Within-Segment
Segment          Within-Segment Variance   Variance
                 to Sample Size

    1                  .02892                    .02894                .02892
    5                  .01319                    .01168                .01446
   10                  .01058                    .00952                .01115
   15                  .00950                    .00881                .00939
   20                  .00887                    .00845                .00843
   25                  .00842                    .00823                .00787
   30                  .00809                    .00809                .00809
   50                  .00725                    .00780                .00777

Using as a base the actual computed mean squares within segments
and between segments for segments of 30 farms (the numerical values
are shown in Table 1), the variances of segment averages for various
segment sizes are estimated from an analysis of components of variance,
using the equation:

$$\sigma_x^2 = \sigma_s^2 + \frac{\sigma^2}{k} \qquad (3)$$

where
$\sigma^2$ = mean square within segments,
$\sigma_s^2$ = [(mean square between segments) − (mean square within segments)] / (farms per segment in the base analysis),
k = farms per segment.

Applied to the peach data, again using 30-farm segments as a basis
for estimation, this formula becomes

$$\sigma_x^2 = \frac{.24253 - .021575}{30} + \frac{.021575}{k} \qquad (4)$$





The solutions are shown in column 2 of Table 2, together with the
observed values (column 3). These data are shown graphically in Figure 1.
Comparison of the observed variances of segment averages with the
estimates obtained by using the two methods described, indicates that
both methods give accurate estimates. The analysis of components of
variance, however, seems to underestimate the variance when segment
size is small. For larger segment sizes, of 20 or more farms, there is
little to choose between the two methods. As indicated above,
within-segment variances are practically constant as segment size
increases beyond 15 farms; the fact that the analysis by components of
variance gives good results on the larger sample segments means simply
that the assumptions underlying the analysis are then being fulfilled.

FIGURE 1
VARIANCE OF SEGMENT AVERAGES, AS RELATED TO SEGMENT SIZE.
COMMERCIAL PEACH ORCHARDS, NORTH CAROLINA.
(Vertical axis: variance of segment average.)
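For comparison with the logarithmic method, the components-of-variance estimate of formula (4) can be sketched in a few lines. The function name is illustrative; the mean squares are those printed in Table 1 for 30-farm segments.

```python
def components_variance(ms_between, ms_within, base_k, k):
    """Formula (3)/(4): between-segment component plus within mean square / k.

    The between component is estimated once, from the base analysis on
    segments of `base_k` farms, and held fixed for every other size k.
    """
    between_component = (ms_between - ms_within) / base_k
    return between_component + ms_within / k


method_2 = {k: components_variance(0.24253, 0.021575, 30, k)
            for k in (1, 5, 10, 15, 20, 25, 30, 50)}
for k, v in method_2.items():
    print(k, round(v, 5))
```

Since the within mean square is treated as constant here, the estimate for small segments is lower than the observed values in column 3 of Table 2, which is the underestimation noted above.
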


ESTIMATING THE PRECISION OF SAMPLES 


The logarithmic method of estimating the variances of sample segment
averages can now be applied directly to the peach orchard data,
making possible an estimate of the standard errors to be expected when
different sampling rates are used on segments of various sizes to
estimate numbers of commercial peach orchards in the area. The standard
errors are obtained from the sample segment variances previously
computed for segments of various sizes, with a finite population
adjustment according to the formula:

$$\sigma_{\bar{x}}^2 = \sigma_x^2\left(\frac{1}{n} - \frac{1}{N_s}\right) \qquad (5)$$

where $\sigma_{\bar{x}}^2$ is the variance of sample averages based on an
enumeration of n of the $N_s$ segments in the universe, and $\sigma_x^2$ is
the variance of the segment averages based on sample segments of
different sizes. These standard errors require that the selected sample
segments be completely enumerated.

TABLE 3
OBSERVED WITHIN-SEGMENT VARIANCES OF NORTH CAROLINA
COMMERCIAL PEACH ORCHARDS

Number of Farms        Within-Segment
per Segment            Variance
      1                   0
      5                  .01807
     10                  .01982
     15                  .02103
     20                  .02169
     25                  .02212
     30                  .02157
     50                  .02156

TABLE 4
STANDARD ERRORS (σ_x̄) OF SAMPLE AVERAGES FOR DIFFERENT SAMPLING
RATES APPLIED TO VARIOUS SIZES OF SAMPLE SEGMENTS.
COMMERCIAL PEACH ORCHARDS, NORTH CAROLINA

                     Computed Standard Errors (σ_x̄) of Sample Averages

Number of            Based on σ_x², Computed by        Based on σ_x² Observed
Sample               Logarithmic Method                in Data
Segments             Number of Farms per               Number of Farms per
Enumerated           Sample Segment                    Sample Segment
                        5        20       50              5        20       50

      1              .11481   .09407   .08490          .12021   .09171   .08789
      5              .05129   .04187   .03752          .05370   .04082   .03884
     10              .03621   .02943   .02613          .03791   .02869   .02705
     50              .01600   .01251   .01013          .01675   .01220   .01049
    100              .01114   .00824   .00549          .01167   .00803   .00568
    171              .00833   .00470   0               .00872   .00543   0
    250              .00671   .00383                   .00703   .00373
    426              .00482   0                        .00505   0
   1000              .00235                            .00245
   1710              0                                 0

Number of Sample
Segments in           1710     426      171             1710     426      171
Population

FIGURE 2
PER CENT OF SAMPLE SEGMENTS OF DIFFERENT SIZES REQUIRED TO BE
ENUMERATED TO GIVE ESTIMATES OF NUMBERS OF COMMERCIAL
PEACH ORCHARDS IN NORTH CAROLINA SANDHILLS WITHIN
STATED LIMITS OF ERROR IN 67% OF THE CASES, i.e. t = 1.00.
(Vertical axis: per cent of sample segments enumerated.)
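The finite population adjustment of formula (5) is simple to apply. In the sketch below the function name is illustrative, and the segment-average variances are the printed Table 2 values for 5-farm segments (.01319 by the logarithmic method, .01446 observed), with 1710 such segments in the population.

```python
import math


def standard_error(segment_variance, n, N_s):
    """Formula (5): standard error of the average of n segments drawn from
    N_s, given the variance of individual segment averages."""
    variance = segment_variance * (1.0 / n - 1.0 / N_s)
    return math.sqrt(max(0.0, variance))  # guard against tiny negative rounding


# 5-farm segments: 1710 segments in the population.
se_logarithmic = {n: standard_error(0.01319, n, 1710) for n in (1, 5, 10, 1710)}
se_observed = {n: standard_error(0.01446, n, 1710) for n in (1, 5, 10, 1710)}
for n in se_logarithmic:
    print(n, round(se_logarithmic[n], 5), round(se_observed[n], 5))
```

The standard error falls to zero when all segments are enumerated, as in the last row of each column of Table 4.
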

Values of $\sigma_{\bar{x}}$ for different sampling rates applied to segments of
various sizes are shown in Table 4. The values based on $\sigma_x^2$ computed
by the logarithmic method are compared with those based on $\sigma_x^2$
observed in the data. The agreement between the computed and observed
values is seen to be close.


The fraction of total farms in the area which had commercial peach
orchards was 257/8550 = 0.0301. Figures 2 and 3 show the sampling
rates necessary on different-sized segments to give, at two confidence
levels, estimates of the population value, 3 per cent, within specified
limits of error. Formula (5) is converted to the form

$$n = \frac{N_s\sigma_x^2}{N_s\sigma_{\bar{x}}^2 + \sigma_x^2}$$

to compute the number of segments of each size needed in each case.
The value of $\sigma_{\bar{x}}^2$ used here is computed from the relationship
$t = (\bar{x} - m)/\sigma_{\bar{x}}$, where $\bar{x}$ is the sample mean, and m is the
population mean. Other values are defined as in formula (5). The values
plotted in Figures 2 and 3 are the respective fractions, $n/N_s$.
Interpretation of these figures, using the 15 per cent line of Figure 2
as an example, is as follows: when single farms are enumerated, 14 per
cent of them must be taken to insure the attainment, in 67 per cent of a
large number of trials, of accuracy within 15 per cent of the true mean,
i.e. 3% ± (.15)(3%) = 3% ± 0.45%; if 50-farm segments are used as
sampling units instead of single farms, 67 per cent of the sample
segments must be enumerated to give similar accuracy.

FIGURE 3
PER CENT OF SAMPLE SEGMENTS OF DIFFERENT SIZES REQUIRED TO BE
ENUMERATED TO GIVE ESTIMATES OF NUMBERS OF COMMERCIAL
PEACH ORCHARDS IN NORTH CAROLINA SANDHILLS WITHIN
STATED LIMITS OF ERROR IN 95% OF THE CASES, i.e. t = 1.96.
(Axes: per cent of sample segments enumerated; number of farms per sample segment.)
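The sampling fractions plotted in Figures 2 and 3 can be approximated with the converted form of formula (5). This is a sketch under stated assumptions: the function name is illustrative, the segment-average variances are the Method 1 values of Table 2 (.02892 for single farms, .00725 for 50-farm segments), and the population mean is taken as 257/8550. Values read from the published charts may differ slightly from the computed ones.

```python
def fraction_required(segment_variance, N_s, mean, rel_error, t):
    """Fraction n/N_s of segments needed so that t standard errors of the
    sample average equal rel_error * mean."""
    target_se = rel_error * mean / t
    n = N_s * segment_variance / (N_s * target_se**2 + segment_variance)
    return n / N_s


mean = 257 / 8550  # fraction of farms with commercial orchards

single_67 = fraction_required(0.02892, 8550, mean, 0.15, 1.00)
fifty_67 = fraction_required(0.00725, 171, mean, 0.15, 1.00)
single_95 = fraction_required(0.02892, 8550, mean, 0.15, 1.96)
print(round(single_67, 3), round(fifty_67, 3), round(single_95, 3))
```

Under these assumptions the single-farm fractions come out near 14 and 39 per cent, and the 50-farm fraction near two-thirds, close to the figures quoted in the text.
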


It is apparent that acceptable accuracy cannot be achieved at high
confidence levels with these widely varying data. Using single-farm
sampling units, an estimate within 15 per cent of the population mean
would require that 39 per cent of the farms be visited to give an
estimate at the 95 per cent confidence level.²

In view of the much greater efficiency of the single-farm sampling
unit, and the relatively low travel costs within the area (the six
counties of the Sandhills cover only about 2900 square miles), it is
likely that if a sampling scheme were to be used, the single-farm
sampling unit would prove most feasible.


The logarithmic method here developed is a convenient and easily
applied short-cut procedure for estimating the variances of segment
averages for sample segments of widely differing sizes. It gives greater
accuracy than does an analysis by variance components, especially
when the segment sizes under consideration are small.


2 There is no indication that the Bureau of Agricultural Economics is contemplating the use of a
sampling plan for this type of estimate. The data given here suggest, in fact, that such a plan would
not be an efficient approach.

GENERAL METHODS OF ANALYSIS FOR INCOMPLETE
BLOCK DESIGNS


C. RADHAKRISHNA Rao 
King’s College, Cambridge 


1. The problem of combinatorial arrangements.
2. Analysis of experimental data.
   (A) Intrablock estimates of varietal effects and analysis of
       variance.
   (B) Adjustment for concomitant variation.
   (C) Adjustment for missing or mixed-up plots.
   (D) Recovery of interblock information.
3. Combined intra and interblock estimates.
4. Estimation of intra and interblock variances.
5. The two fundamental designs.
   (A) Partially balanced incomplete blocks.
   (B) Intra and intergroup balanced incomplete blocks.
6. An illustrative example: Intra and interblock information.

Various types of experimental designs have been introduced 
since 1936 for testing a large number of varieties. They are de- 
signed to suit the requirements of the experimenter and with 
the object of achieving maximum efficiency for testing a given 
number of varieties with a limited amount of experimental 
material. The numerical methods for analysis of experimental 
data arising from any design fall under four categories, (i) 
intrablock analysis, (ii) adjustment for concomitant variation, 
(iii) adjustment for missing plots and (iv) recovery of inter- 
block information. The author has attempted in this paper 
to present a unified method for the reduction of experimental 
data to suit all types of designs and methods of analysis. This 
is achieved in two stages. Firstly, it has been noted that 
there are two fundamental designs which include all the pre- 
viously known designs as special cases. Secondly, the formulae 
relating to intrablock analysis have been designed to yield the 
formulae for the other types of analysis with certain changes 
of parameters. The problem of unification is thus reduced to 
the listing of a few formulae relating to intrablock analysis 
in the case of the two fundamental designs termed as the 
partially balanced and intra and intergroup balanced incom- 
plete blocks. An example has been worked out to illustrate 
the practical application of these formulae. 


1. THE PROBLEM OF COMBINATORIAL ARRANGEMENTS 
The fundamental problem of combinatorial arrangements for incomplete
block designs refers to v varieties to be tested in b blocks
of size k < v with the i-th variety used $r_i$ times and the i-th and j-th
varieties used together in $\lambda_{ij}$ blocks. The parameters of the design
have to be specified in view of three requirements. Firstly, the desired
comparisons must be estimable and tested with the maximum possible
precision. Incidentally other comparisons may also be estimated and
tested with perhaps less precision. Secondly, the total number of plots
involved in an experiment must not be large so that uniformity of
agricultural operations may be maintained over all parts of the field
and the cost of operations may not be heavy. Thirdly, the computational
procedure for analysis and interpretation of observed data must
be simple and mechanical so that technicians interested in any field of
work may, without any difficulty, analyze for themselves the
observations from their experiments.

Two solutions to the above problem known as the balanced incom- 
plete block and quasi-factorial designs were suggested by Yates [12 and 
13]. The limited number of designs belonging to these types and the 
varied requirements of the experimenter have led other workers to 
discover new types of designs. The first fruits of this search are the 
partially balanced designs introduced by Bose and Nair [2], and later 
generalized by Nair and Rao [8] to cover all previously known designs 
and open out a wide variety of possible experimental arrangements. A 
special class of this design known as quasi-factorial (Nair and Rao [7]) 
leads to confounded designs in symmetrical and asymmetrical factorial 
experiments as well (Nair and Rao, [5] and [6]). Later a second type of 
design known as the intra and inter group balanced design was in- 
troduced to cover some new situations which arise in varietal trials 
(Nair and Rao [9]). These two types attempt to cover the whole field 
of experimental designs including as special cases all the well known 
designs. The various special classes of designs derivable from the two 
types are diagrammatically represented below. 


                    Arrangements for varietal trials

    Generalised partially balanced        Intra and inter group balanced

    Quasi-factorial    Lattice    Other types    Method of controls

    Balanced incomplete block



2. ANALYSIS OF EXPERIMENTAL DATA 


The unification achieved by the introduction of these two types of
combinatorial arrangements has resulted in a certain amount of
mechanization of the arithmetical discussion of observed data as well.
There are four types of analysis used in the reduction of experimental
results.

(A) Intra block estimates of varietal effects and analysis of variance. 
When the varieties in a block are randomized, intra block comparisons 
of yields allow us to compute unbiased estimates of varietal differences. 
These estimates are subject to an error which depends solely on the 
fertility differences inside a block and when the experimental material 
is very heterogeneous the efficiency of comparisons can be increased by 
diminishing the block size. The arithmetical discussion leading to tests 
of significance is known as the analysis of variance. 

This method of analysis is made available for all types of designs by 
listing a few formulae for the two fundamental types. These formulae 
as shown in a later section can be easily computed, knowing the 
parameters of a design. 

(B) Adjustment for concomitant variation. This is a device by which 
the precision of varietal comparisons can be enhanced by removing the 
variation caused by variables observable with the yield under con- 
sideration. Care must be taken to see that such variables, known as 
concomitant variates, are not affected by the qualities inherent in a 
variety. In fact random variations in the concomitant variates should 
produce corresponding variation in the yield. 

Elsewhere [10], [11] I have discussed the computational procedure for 
adjusting for concomitant variation in any type of design. No fresh 
formulae are needed and the analysis can be carried out without heavy 
computation. We need only repeat the method of analysis (A) with 
each concomitant variate and set up the analysis of variance and 
covariance table by suitable calculations. 

(C) Adjustment for missing or mixed-up plots. Some amount of 
mechanization is introduced by the methods developed by Bartlett [1] 
in the case of missing plots and Nair [3] in the case of mixed-up plots. 
In the case of missing plots we introduce a pseudovariate corresponding 
to each missing plot such that it has the value 1 for the missing plot and 
0 elsewhere. The value of yield is of course taken as 0. The number of 
pseudovariates is equal to the number of missing plots and the analysis 
of variance and covariance may be carried out as in (B). 

In the case of the yields of s plots getting mixed up, leaving a known total u, 
we introduce (s-1) pseudovariates $z_1, z_2, \ldots, z_{s-1}$, all having value 0 in 
the unaffected plots and values as shown in Table 1 for the s affected 
plots. 


TABLE 1 

VALUES ASSIGNED TO THE PSEUDO- AND OBSERVED 
CHARACTERS IN EACH OF MIXED UP PLOTS 

                          Pseudo-characters
Mixed up plot     z_1     z_2     ...     z_{s-1}     Observed character
      1            1       1      ...       1               u/s
      2           1-s      1      ...       1               u/s
      3            1      1-s     ...       1               u/s
     ...
      s            1       1      ...      1-s              u/s


The analysis of variance and covariance table may be set up with 
the pseudo-variates as concomitant variates. 

(D) Recovery of interblock information. When incomplete block de- 
signs were first introduced attention was directed to methods for ex- 
tracting the intrablock information which gives estimates of varietal 
differences built up from comparisons arising within a block. The 
comparisons of block totals were neglected on the score that they are 
subject to a greater variability. If, however, the differences in fertility 
of the various blocks were in fact small some loss of efficiency would 
result as compared with arrangements in ordinary randomized blocks. 
The technique by which block comparisons are taken into account in 
arriving at varietal effects greatly enhances the value of these designs 
for when this information is utilized they can in no event be appreciably 
less accurate than the ordinary randomized blocks containing all the 
varieties. If the ratio of the variances for inter and intra block com- 
parisons is sufficiently greater than 1, these estimates from incomplete 
block designs will of course be considerably more accurate. 

The methods for recovery of interblock information appropriate to 
the designs introduced by Yates have been considered by Yates him- 
self in a series of papers (Yates, [15], [16] and [17]). In the case of 
partially balanced designs the method of solution is indicated by Nair 
[4]. The object of this paper is to show that no fresh formulae need be 
listed for this type of analysis and by certain changes in parameters of 
the design the formulae appropriate to intrablock estimates discussed in 
(A) can be used for getting the combined intra and interblock estimates. 
This makes the method of analysis very simple for practical applica- 
tions. 

Thus it follows that a few appropriate formulae giving intrablock 
estimates in the case of the two fundamental designs, the generalized 
partially balanced and the intra and intergroup balanced incomplete 
blocks, would serve all the designs derivable from them and all the 
methods of analysis used in the reduction of experimental data. The 
formulae are listed in section 5 for practical use. Some of the formulae 
are reproduced from the series of papers by K. R. Nair and the author. 
Others have been so modified by a suitable choice of the constraining 
relation involving the varietal effects as to be useful for the derivation 
of the combined intra and interblock estimates as well. 


3. COMBINED INTRA AND INTERBLOCK ESTIMATES 


Let us consider the case of a general design with v varieties to be 
tested in b blocks of size k such that the i-th variety is used $r_i$ times and 
the i-th and j-th varieties occur together in $\lambda_{ij}$ blocks. If the varieties in a block 
are randomized then each block supplies (k-1) comparisons which are 
subject to an error depending only on the fertility differences of plots 
in a block. Let these be represented, if $y_1, \ldots, y_k$ are the yields of the k 
varieties in a block, by 

$$l_{1j} y_1 + l_{2j} y_2 + \cdots + l_{kj} y_k, \qquad j = 1, 2, \ldots, k-1 \tag{3.1}$$

such that 

$$\sum_i l_{ij} = 0, \qquad \sum_i l_{ij}^2 = 1, \qquad \sum_i l_{ij}\, l_{ij'} = 0 \ \text{ for } j \neq j'. \tag{3.2}$$

The conditions (3.2) ensure that the functions (3.1) are uncorrelated 
and subject to the same error. The expectation of each such function is 
equal to the corresponding linear function of the varietal effects. There 
are b(k-1) functions of the type (3.1) arising from the b blocks which 
are all independent and subject to the same error variance which may 
be represented by $\sigma^2$. They supply b(k-1) observational equations 
and the estimates of varietal comparisons from them are called intra- 
block estimates. 
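
As an illustration of conditions (3.2), the sketch below builds one possible set of within-block comparisons, the Helmert contrasts, and checks the conditions numerically. The choice of Helmert contrasts and the function names are assumptions made for this example; the paper only requires that some set of coefficients satisfying (3.2) exists.

```python
import math


def helmert_contrasts(k):
    """Return k-1 contrasts over k plots; contrasts[j][i] plays the role of l_ij."""
    contrasts = []
    for j in range(1, k):
        norm = math.sqrt(j * (j + 1))
        # First j plots get +1/norm, plot j+1 gets -j/norm, remaining plots get 0.
        contrasts.append([1.0 / norm] * j + [-j / norm] + [0.0] * (k - j - 1))
    return contrasts


def satisfies_conditions(contrasts, tol=1e-12):
    """Check the three conditions of (3.2): zero sum, unit length, orthogonality."""
    for a, ca in enumerate(contrasts):
        if abs(sum(ca)) > tol:
            return False
        if abs(sum(x * x for x in ca) - 1.0) > tol:
            return False
        for cb in contrasts[a + 1:]:
            if abs(sum(x * y for x, y in zip(ca, cb))) > tol:
                return False
    return True


def plot_product(contrasts, i, i2):
    """Sum over j of l_ij * l_i'j for two plots i and i' (0-based)."""
    return sum(c[i] * c[i2] for c in contrasts)
```

For any such set the sum over j of $l_{ij}^2$ is (k-1)/k and the sum over j of $l_{ij} l_{i'j}$ is -1/k, which is the pair of identities used later in deriving the intra block normal equations.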

If the b block totals are represented by $B_1, B_2, \ldots, B_b$ and the cor- 
responding mean values by placing bars over them then the b mean 
values may be replaced by (b-1) comparisons 

$$m_{1j}\bar{B}_1 + \cdots + m_{bj}\bar{B}_b, \qquad j = 1, 2, \ldots, b-1 \tag{3.3}$$

such that 

$$\sum_i m_{ij} = 0, \qquad \sum_i m_{ij}^2 = 1, \qquad \sum_i m_{ij}\, m_{ij'} = 0 \ \text{ for } j \neq j' \tag{3.4}$$

and a b-th function representing the grand mean. If the blocks are 
randomized the (b-1) functions (3.3) become stochastic variates un- 
correlated with the b(k-1) functions (3.1) and have as their expecta- 
tions the corresponding linear functions of the varietal effects. These 
are subject to an error which depends not only on fertility differences 
inside a block but also on fertility differences between blocks if they 
exist. If $\sigma'^2$ is the error variance per plot for these comparisons built up 
from block means then 

$$\sigma'^2 = \sigma^2 + k\beta^2 \tag{3.5}$$

where k is the block size, $\sigma^2$ is the intra block error variance and $\beta^2$ is the 
inter block error variance arising out of fertility differences per plot of 
blocks. The estimates of varietal comparisons, if estimable, from the 
observational equations supplied by (3.3) are called inter block esti- 
mates. 

The best estimates of varietal comparisons, in any case, have to be 
found from the two sets of functions (3.1) and (3.3) which are subject to 
two different errors. The observational equation corresponding to the 
grand mean involves a constant which depends on the general fertility 
of the experimental field and hence is of no use in estimating varietal 
comparisons. The best estimates can be found by the method of least 
squares as discussed by Rao [11]. 

If the observational equations corresponding to the functions (3.1) 
and (3.3) are written as 

$$\begin{aligned}
\theta_i &= f_i(\tau), & i &= 1, 2, \ldots, b(k-1) \\
\theta_i' &= f_i'(\tau), & i &= 1, 2, \ldots, b-1
\end{aligned} \tag{3.6}$$

where $\tau$'s stand for varietal effects and f's are linear functions then the 
expression to be minimized is 

$$L = w\sum [\theta_i - f_i(\tau)]^2 + kw'\sum [\theta_i' - f_i'(\tau)]^2 \tag{3.7}$$

where $w = 1/\sigma^2$ and $w' = 1/\sigma'^2$. 

If the intrablock estimates are required one need only minimize the 
first expression in (3.7). It may be noted that when the sets of varieties 
are not assigned to blocks at random but only the varieties in a set are 
randomized within a block, the experiment can supply only intrablock 
estimates. The second set of observational equations in (3.6) do not 
supply, in this case, unbiassed estimates of varietal comparisons and 
hence, the estimates have to be based on the first set of equations only. 

The s-th normal equation giving an intrablock comparison is ob- 
tained by equating the derivative of $w\sum[\theta_i - f_i(\tau)]^2$ with respect to $\tau_s$ 
to zero. To simplify the algebra one may first find the contribution to 
the derivative due to the sum of squares arising from a block in which 
the s-th variety occurs and then get the total. Taking the expressions 
from (3.1) one finds the derivative arising from comparisons in a block 
as 

$$-2w\Big[\sum_j l_{sj}\sum_i l_{ij}\,y_i \ - \ \text{the corresponding expression in } \tau\text{'s instead of } y\text{'s}\Big]$$

$$= -2w\Big[y_s - \frac{1}{k}\sum_i y_i \ - \ \text{the corresponding expression in } \tau\text{'s}\Big]$$

since under the conditions (3.2), 

$$\sum_j l_{ij}^2 = \frac{k-1}{k} \qquad \text{and} \qquad \sum_j l_{ij}\, l_{i'j} = -\frac{1}{k}.$$

Taking the total and equating to zero one has the equations giving intra 
block estimates as 

$$wQ_s = w\Big[r_s\,\frac{k-1}{k}\,t_s - \frac{\lambda_{s1}}{k}\,t_1 - \cdots - \frac{\lambda_{sv}}{k}\,t_v\Big], \qquad s = 1, 2, \ldots, v \tag{3.8}$$

where the terms in $\lambda$ run over the varieties other than the s-th, $Q_i$ = sum of 
yields for the i-th variety minus the sum of means of 
blocks in which the i-th variety occurs, and $t_i$ = the estimate 
of $\tau_i$. 

If separate interblock estimates are needed (possible only when sets of 
varieties are assigned at random to blocks) we minimize the second 
sum of squares in (3.7). This leads to the equations, 

$$w'Q_s' = w'\Big[\Big(\frac{r_s}{k} - \frac{r_s^2}{bk}\Big)t_s + \Big(\frac{\lambda_{s1}}{k} - \frac{r_s r_1}{bk}\Big)t_1 + \cdots + \Big(\frac{\lambda_{sv}}{k} - \frac{r_s r_v}{bk}\Big)t_v\Big] \tag{3.9}$$

where $Q_i'$ = sum of means of blocks in which the i-th variety occurs 
minus $r_i$ times the grand mean. 
The method of deriving (3.9) is the same as that used for (3.8). Using 
the constraining relation 

$$\sum r_i t_i = 0 \tag{3.10}$$

the equations (3.9) giving interblock estimates only may be written as 

$$w'Q_s' = w'\Big[\frac{r_s}{k}\,t_s + \frac{\lambda_{s1}}{k}\,t_1 + \cdots + \frac{\lambda_{sv}}{k}\,t_v\Big]. \tag{3.11}$$
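Adding corresponding equations in (3.8) and (3.11) we get the equations 
giving the combined estimates as 

$$P_s = R_s\,\frac{k-1}{k}\,t_s - \frac{\Lambda_{s1}}{k}\,t_1 - \cdots - \frac{\Lambda_{sv}}{k}\,t_v \tag{3.12}$$

where 

$$\begin{aligned}
R_i &= r_i[w + w'/(k-1)] \\
\Lambda_{ij} &= \lambda_{ij}(w - w') \\
P_i &= wQ_i + w'Q_i'.
\end{aligned} \tag{3.13}$$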


The solutions for (3.12) in terms of P, R and $\Lambda$ can be seen to be the 
same as those for (3.8) and (3.10) in terms of Q, r and $\lambda$. But equations 
(3.8) with the restrictions (3.10) give intra block estimates. Hence the 
combined estimates can be obtained from the expressions for intra- 
block estimates by the changes indicated in (3.13). The expressions for 
variances of varietal differences derived from the combined equations 
(3.12) can, for the same reason, be obtained from the corresponding 
expressions (omitting the multiplier $\sigma^2$) for intra block estimates by 
making the above changes from r to R and $\lambda$ to $\Lambda$. Thus inter block 
estimates and their variances need no fresh formulae to be given. 
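
The substitution in (3.13) is mechanical, and a minimal sketch may make that concrete. The function name and argument layout below are assumptions for illustration; the arithmetic is only that of (3.13) for a design in which every variety has the same replication r.

```python
def combined_parameters(r, lambdas, k, w, w_prime):
    """Apply (3.13): replace r by R and each lambda by Lambda.

    r        -- common number of replications
    lambdas  -- list of lambda values (one per associate class)
    k        -- block size
    w, w_prime -- reciprocals of the intra and inter block error variances
    """
    R = r * (w + w_prime / (k - 1))
    Lambdas = [lam * (w - w_prime) for lam in lambdas]
    return R, Lambdas
```

With r = 4, k = 5, $\lambda_1 = 0$, $\lambda_2 = 1$, w = .02417 and w' = .002015 (the values of the illustrative example in section 6) this gives R close to .0987 and $\Lambda_2$ close to .02215. When w = w' every $\Lambda$ vanishes, which is why the combined analysis then collapses to a simple comparison of variety means.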

If w and w' are accurately known then a test for varietal comparisons 
is supplied by the statistic, 

$$\chi^2 = \sum_{i=1}^{v} t_i P_i \tag{3.14}$$

which can be used as $\chi^2$ with (v-1) degrees of freedom, t's being 
solutions of (3.12). When the weights w and w' are estimated on a large 
number of degrees of freedom the above test can be used as an ap- 
proximation. No exact test is, however, available when small degrees 
of freedom are involved. 
When w = w' we get from the equations (3.12), 

$$t_i = \frac{V_i}{r_i} - \text{grand mean} \tag{3.15}$$

where $V_i$ is the total yield of the i-th variety. It immediately follows 
that when the intra and inter block errors are of the same magnitude 
the design can be analyzed as simple analysis of variance of between 
and within varieties. 

If, however, the design is resolvable into r complete replications 
(possible only when $r_1 = r_2 = \cdots = r_v = r$) and the blocks within a 
replication are randomized there are only (b-r) functions of the type 
(3.3) and the error variance $\sigma'^2$ stands for inter block error within a 
replication. The normal equations giving the varietal effects are the 
same as (3.12). In this case if w = w' then the analysis of the design 
reduces to that of a simple randomized block. In any situation if on 
removing the variation due to replications the intra block error and 
inter block error within a replication differ by a small magnitude, then 
analysis of the design as a randomized block instead of the exact but 
more elaborate method may lead to the same interpretation of data. 


4. ESTIMATION OF INTRA AND INTER-BLOCK VARIANCES 


The estimation of $\sigma$ and $\sigma'$ involves some difficulty as the equations 
leading to their best estimates are complicated. A simple method by 
which fairly accurate estimates can be obtained in any general case 
is as follows. We set out the analysis of variance as in Table 2. 


TABLE 2 
ANALYSIS OF VARIANCE FOR ANY GENERAL DESIGN 

                                 d.f.    S.S.                 S.S.      d.f.
Blocks (ignoring varieties)      b-1     U_1                  - = V_2   b-1    Blocks (eliminating varieties)
Varieties (eliminating blocks)   v-1     Σ t_i Q_i            U_3       v-1    Varieties (ignoring blocks)
Error                             -      - = V_1      ->      V_1        -     Error
Total                            bk-1    U_2          ->      U_2       bk-1   Total


The expressions $U_1$, $U_2$ and $U_3$ are calculated in the usual manner. If B 
represents a block total and V a variety total then 

$$U_1 = \frac{1}{k}\sum B^2 - \text{c.f. (correction factor)}$$
$$U_2 = \sum y^2 - \text{c.f.}$$
$$U_3 = \sum \frac{V_i^2}{r_i} - \text{c.f.}$$

The expression $\sum t_i Q_i$, where $t_i$ are solutions of (3.8), is the usual expres- 
sion for sum of squares due to varieties in intra block analysis. The 
expressions indicated by the - sign are obtained by subtraction. The 
arrows indicate that expressions have to be carried from the left to the 
right-hand side of the table. 
The estimates of $\sigma^2$ and $\beta^2$ are obtained by equating $V_1$ and $V_2$ of the 
above table to their expectations, 

$$\begin{aligned}
E(V_1) &= (bk - b - v + 1)\sigma^2 \\
E(V_2) &= (b - 1)\sigma^2 + (bk - v)\beta^2
\end{aligned} \tag{4.1}$$

so that 

$$\begin{aligned}
\sigma^2 &\simeq V_1/(bk - b - v + 1) \\
\beta^2 &\simeq \frac{1}{bk - v}\Big[V_2 - \frac{(b-1)V_1}{bk - b - v + 1}\Big]
\end{aligned} \tag{4.2}$$

and 

$$\sigma'^2 = \sigma^2 + k\beta^2.$$

In case the design is resolvable into r replications (only when 
$r_1 = \cdots = r_v = r$) and the blocks within a replication are random- 
ized the sum of squares due to replications has to be removed from 
$V_2$ to get $\sigma'^2$, which now represents the interblock intra-replication 
error. 

                                 d.f.    S.S.
Replications                     r-1     (1/v) Σ R² - c.f.
Blocks (within replication)      b-r     - = V_3
Blocks (eliminating varieties)   b-1     V_2

The estimates of $\sigma^2$ and $\beta^2$ are obtained from the equations 

$$\begin{aligned}
E(V_1) &= (bk - b - v + 1)\sigma^2 \\
E(V_3) &= (v - k)(r - 1)\beta^2 + (b - r)\sigma^2.
\end{aligned} \tag{4.3}$$
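
The estimates obtained by equating the two sums of squares to their expectations in (4.1) can be written as a short routine. This is a minimal sketch for the non-resolvable case only; the function name is an assumption, and no provision is made for a negative estimate of the inter block component.

```python
def estimate_variances(V1, V2, b, k, v):
    """Solve (4.1) for the intra block variance and the inter block component.

    V1 -- error sum of squares (intra block)
    V2 -- sum of squares for blocks eliminating varieties
    Returns (sigma2, beta2, sigma_prime2), where sigma_prime2 = sigma2 + k * beta2.
    """
    error_df = b * k - b - v + 1
    sigma2 = V1 / error_df
    beta2 = (V2 - (b - 1) * sigma2) / (b * k - v)
    sigma_prime2 = sigma2 + k * beta2
    return sigma2, beta2, sigma_prime2
```

Called with the sums of squares of the illustrative example in section 6 (V1 = 1861.46, V2 = 6079.79, b = 16, k = 5, v = 20), it returns values close to the 41.37, 90.99 and 496.32 quoted there, the small differences being due to rounding in the hand computation.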


5. THE TWO FUNDAMENTAL DESIGNS 


(A) Partially Balanced Incomplete Blocks. The combinatorial prob- 
lem leading to the generalized class of partially balanced designs is as 
follows: 

(a) There are v varieties to be tested in b blocks of k plots, each 
variety being replicated r times. 

(b) Given any variety it is possible to divide the rest of the varieties, 
by some means of discrimination, into groups of sizes $n_1, n_2, \ldots$ and 
$n_m$ which may be called its first, second, ... and m-th associates re- 
spectively. If we denote the number of blocks in which the i-th and 
j-th varieties occur by $\lambda_{ij}$ then it is necessary for our design that 
$\lambda_{ij} = \lambda_s$ whenever the j-th variety is the s-th associate of the i-th variety, 
$\lambda_s$ being independent of i. There are as many $\lambda$'s as there are associates 
and the parameters $\lambda_{ij}$ for any i are such that $n_1$ of them are equal to 
$\lambda_1$, $n_2$ of them equal to $\lambda_2$, ... and $n_m$ of them equal to $\lambda_m$. It is easy to 
see that when the $\lambda$'s are different they offer a means of discrimination of 
the nature of associations as first, second, etc. associates. Otherwise, a 
'tactical configuration' of the varieties from which the nature of as- 
sociation is derivable has to be annexed to the design. This, in most 
cases, is supplied by the method of the construction of designs. We shall 
have to keep the $\lambda$'s distinct with special reference to the nature of 
association. The following relations identically hold. 

$$n_1 + n_2 + \cdots + n_m = v - 1$$
$$n_1\lambda_1 + \cdots + n_m\lambda_m = r(k - 1).$$

(c) Let $p_{rs}^{k}$ be the number of varieties common to the r-th associ- 
ates and s-th associates of a pair of varieties which are k-th associates. 
For the design this quantity is independent of the pair of k-th as- 
sociates we start with. 

Now it follows that 

$$p_{rs}^{k} = p_{sr}^{k}$$
$$n_k\,p_{rs}^{k} = n_r\,p_{ks}^{r} = n_s\,p_{rk}^{s}$$
$$\sum_{s=1}^{m} p_{rs}^{k} = n_r - 1 \ \text{ or } \ n_r \ \text{ according as } k = r \text{ or } k \neq r.$$

The quantities n's and $\lambda$'s are called the first system of parameters and 
the quantities $p_{rs}^{k}$ the second system of parameters of the design. 
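
The relations between the two systems of parameters are easy to check by machine before a design is used. The sketch below is an assumed helper, not part of the paper; it stores $p_{rs}^{k}$ as `p[k][r][s]` with 0-based indices and tests the relations stated above.

```python
def check_first_system(v, r, k, n, lam):
    """n_1 + ... + n_m = v - 1 and n_1*lambda_1 + ... + n_m*lambda_m = r(k - 1)."""
    return (sum(n) == v - 1
            and sum(ni * li for ni, li in zip(n, lam)) == r * (k - 1))


def check_second_system(n, p):
    """Check symmetry, the n_k p_rs^k = n_r p_ks^r relation and the row sums."""
    m = len(n)
    for kk in range(m):
        for rr in range(m):
            # Row sum is n_r - 1 when k = r and n_r otherwise.
            expected = n[rr] - 1 if kk == rr else n[rr]
            if sum(p[kk][rr]) != expected:
                return False
            for ss in range(m):
                if p[kk][rr][ss] != p[kk][ss][rr]:
                    return False
                if n[kk] * p[kk][rr][ss] != n[rr] * p[rr][kk][ss]:
                    return False
    return True
```

For the design of the illustrative example in section 6 (v = 20, r = 4, k = 5, n = (3, 16), $\lambda$ = (0, 1)), both checks pass with the two matrices of second-system parameters given there.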
Apparently the minimum number of associates is equal to the num- 
ber of different \’s. But the minimum for the design is that number for 
which the second system of parameters satisfy the conditions of the 
previous paragraph. Sometimes an artificial grouping is introduced by 
the splitting of one or two groups in the minimum set for the design. 
This is at once detected by the second system of parameters. By 
suitable row and column additions and omissions in the matrices 
(pre*)(r, s=1, 2, - - - m) we can get the parameters of the design with 
the reduced number of associates. In case there is an artificial grouping 
some of the matrices after reduction become identical. Only one of 
them has to be retained for the reduced design. 
This design, as originally introduced by Bose and Nair [2], had the 
restriction that all $\lambda$'s are different. This restriction is now withdrawn 
in order that it may cover a wide variety of designs. 

The analysis of variance appropriate to partially balanced designs 
involving two and three associates is given below as they are the most 
useful cases in practice. Only intra block estimates are given as the 
necessary formulae for inter block estimates are derivable from them 
by changing the parameters as indicated in (3.13) 


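Let $Q_i$ = sum of yields for the i-th variety minus the sum of means of 
blocks in which it occurs. 
$\sum Q_{ij}$ = sum of Q's for the i-th variety and its j-th associates. 


The varietal effects from which comparisons are estimated are 

$$v_i = \frac{k}{\Delta}\Big[(B_{22} + B_{12})Q_i - B_{12}\sum Q_{i1}\Big], \qquad i = 1, 2, \ldots, v \tag{5.1}$$

where 

$$\begin{aligned}
A_{12} &= r(k-1) + \lambda_2 \\
A_{22} &= (\lambda_2 - \lambda_1)\,p_{12}^{2} \\
B_{12} &= \lambda_2 - \lambda_1 \\
B_{22} &= r(k-1) + \lambda_2 + (\lambda_2 - \lambda_1)(p_{11}^{1} - p_{11}^{2}) \\
\Delta &= A_{12}B_{22} - A_{22}B_{12}.
\end{aligned} \tag{5.2}$$

It is easier to calculate the varietal effects by the above formula when 
$n_1 \le n_2$. If $n_1 > n_2$ an alternative expression is 

$$v_i = \frac{k}{\Delta}\Big[(B_{21} + B_{11})Q_i - B_{11}\sum Q_{i2}\Big] \tag{5.3}$$

where 

$$\begin{aligned}
A_{11} &= r(k-1) + \lambda_1 \\
A_{21} &= (\lambda_1 - \lambda_2)\,p_{12}^{1} \\
B_{11} &= \lambda_1 - \lambda_2 \\
B_{21} &= r(k-1) + \lambda_1 + (\lambda_1 - \lambda_2)(p_{22}^{2} - p_{22}^{1}) \\
\Delta &= A_{11}B_{21} - A_{21}B_{11}.
\end{aligned} \tag{5.4}$$

The sum of squares due to varieties is, as before, 

$$\sum v_i Q_i. \tag{5.5}$$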
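
A minimal sketch of the constants in (5.2) and the effect formula (5.1) follows. The function names and the 0-based storage of the second-system parameters (`p1[i][j]` for $p_{ij}^{1}$, `p2[i][j]` for $p_{ij}^{2}$) are assumptions made for the example.

```python
def pbib2_constants(r, k, lam1, lam2, p1, p2):
    """Constants A12, A22, B12, B22 and Delta of (5.2)."""
    A12 = r * (k - 1) + lam2
    A22 = (lam2 - lam1) * p2[0][1]            # (lambda2 - lambda1) * p_12^2
    B12 = lam2 - lam1
    B22 = A12 + (lam2 - lam1) * (p1[0][0] - p2[0][0])
    delta = A12 * B22 - A22 * B12
    return A12, A22, B12, B22, delta


def varietal_effect(k, Q_i, sum_Q_i1, B12, B22, delta):
    """Intra block estimate (5.1); sum_Q_i1 includes Q_i itself."""
    return k * ((B22 + B12) * Q_i - B12 * sum_Q_i1) / delta
```

With the parameters of the example in section 6 the constants come out as 17, 3, 1, 19 and 320, and for variety a (where kQ = -71 and the group sum of kQ is -49) the effect is -1371/320, or about -4.28.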


There are two types of comparisons to be made and the appropriate 
variances of differences are 

$$\begin{aligned}
V(v_i - v_j) &= 2kB_{21}\sigma^2/\Delta \quad \text{if i and j are first associates} \\
&= 2kB_{22}\sigma^2/\Delta \quad \text{if i and j are second associates.}
\end{aligned} \tag{5.6}$$

The over-all efficiency of this design is 

$$\text{E.F.} = \frac{(v-1)\Delta}{rk[B_{22}(v-1) + n_1B_{12}]}. \tag{5.7}$$
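
The variances (5.6) and the efficiency factor (5.7) can be sketched as below. Function names are assumptions; the efficiency is taken, as in the text, relative to randomized blocks with the same replication, whose variance of a difference is $2\sigma^2/r$.

```python
def pbib2_variances(k, B21, B22, delta, sigma2):
    """Variances of a difference for first and second associates, as in (5.6)."""
    first = 2 * k * B21 * sigma2 / delta
    second = 2 * k * B22 * sigma2 / delta
    return first, second


def pbib2_efficiency(v, r, k, n1, B12, B22, delta):
    """Over-all efficiency factor (5.7)."""
    return (v - 1) * delta / (r * k * (B22 * (v - 1) + n1 * B12))
```

For the example of section 6 (v = 20, r = 4, k = 5, $n_1$ = 3, $B_{12}$ = 1, $B_{21}$ = 20, $B_{22}$ = 19, $\Delta$ = 320) the efficiency factor is about 0.835, which agrees with the ratio of $2\sigma^2/r$ to the average variance over all pairs of varieties.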
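In the case of designs with three associates we have to calculate 

$$\begin{aligned}
A_{13} &= r(k-1) + \lambda_3, \qquad B_{13} = \lambda_3 - \lambda_1, \qquad C_{13} = \lambda_3 - \lambda_2 \\
A_{23} &= (\lambda_3 - \lambda_1)(n_1 - p_{11}^{3}) - (\lambda_3 - \lambda_2)\,p_{12}^{3} \\
A_{33} &= (\lambda_3 - \lambda_2)(n_2 - p_{22}^{3}) - (\lambda_3 - \lambda_1)\,p_{12}^{3} \\
B_{23} &= r(k-1) + \lambda_3 + (\lambda_3 - \lambda_1)(p_{11}^{1} - p_{11}^{3}) + (\lambda_3 - \lambda_2)(p_{12}^{1} - p_{12}^{3}) \\
B_{33} &= (\lambda_3 - \lambda_1)(p_{12}^{1} - p_{12}^{3}) + (\lambda_3 - \lambda_2)(p_{22}^{1} - p_{22}^{3}) \\
C_{23} &= (\lambda_3 - \lambda_1)(p_{11}^{2} - p_{11}^{3}) + (\lambda_3 - \lambda_2)(p_{12}^{2} - p_{12}^{3}) \\
C_{33} &= r(k-1) + \lambda_3 + (\lambda_3 - \lambda_1)(p_{12}^{2} - p_{12}^{3}) + (\lambda_3 - \lambda_2)(p_{22}^{2} - p_{22}^{3}) \\
F &= B_{23}C_{33} - B_{33}C_{23}, \qquad G = B_{33}C_{13} - B_{13}C_{33}, \qquad H = B_{13}C_{23} - B_{23}C_{13} \\
\Delta &= A_{13}F + A_{23}G + A_{33}H.
\end{aligned}$$

The estimates of varietal effects for comparisons are 

$$v_i = \frac{k}{\Delta}\Big[(F - G - H)Q_i + G\sum Q_{i1} + H\sum Q_{i2}\Big], \qquad i = 1, 2, \ldots, v.$$

The variances of estimated differences are 

$$\begin{aligned}
V(v_i - v_j) &= 2k\sigma^2(F - G)/\Delta \quad \text{if i and j are first associates} \\
&= 2k\sigma^2(F - H)/\Delta \quad \text{if i and j are second associates} \\
&= 2k\sigma^2F/\Delta \quad \text{if i and j are third associates.}
\end{aligned}$$

The over-all efficiency factor is 

$$\frac{(v-1)\Delta}{[(v-1)F - n_1G - n_2H]\,rk}.$$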
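
The three-associate constants amount to solving a 3 x 3 system by determinants, and a sketch of the computation may help when many designs have to be processed. The function names and the storage convention `p[c][a][b]` for $p_{ab}^{c}$ (0-based) are assumptions for illustration.

```python
def pbib3_constants(r, k, lam, n, p):
    """Return F, G, H and Delta for a design with three associate classes."""
    l1, l2, l3 = lam
    A13 = r * (k - 1) + l3
    B13 = l3 - l1
    C13 = l3 - l2
    A23 = B13 * (n[0] - p[2][0][0]) - C13 * p[2][0][1]
    A33 = C13 * (n[1] - p[2][1][1]) - B13 * p[2][0][1]
    B23 = A13 + B13 * (p[0][0][0] - p[2][0][0]) + C13 * (p[0][0][1] - p[2][0][1])
    B33 = B13 * (p[0][0][1] - p[2][0][1]) + C13 * (p[0][1][1] - p[2][1][1])
    C23 = B13 * (p[1][0][0] - p[2][0][0]) + C13 * (p[1][0][1] - p[2][0][1])
    C33 = A13 + B13 * (p[1][0][1] - p[2][0][1]) + C13 * (p[1][1][1] - p[2][1][1])
    F = B23 * C33 - B33 * C23
    G = B33 * C13 - B13 * C33
    H = B13 * C23 - B23 * C13
    delta = A13 * F + A23 * G + A33 * H
    return F, G, H, delta


def pbib3_effect(k, Q_i, sum_Q_i1, sum_Q_i2, F, G, H, delta):
    """Varietal effect; each sum includes Q_i along with the associates' Q's."""
    return k * ((F - G - H) * Q_i + G * sum_Q_i1 + H * sum_Q_i2) / delta
```

One simple sanity check is the degenerate case in which all three $\lambda$'s are equal: G and H then vanish and the effect reduces to kQ divided by r(k-1) + $\lambda$, the familiar balanced incomplete block result.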
(B) Intra- and Intergroup Balanced Incomplete Blocks. The funda- 
mental problem of arrangements as enunciated in the introduction can 
be associated with the problem of balancing which requires that a given 
set of comparisons are estimable with equal efficiency. When once the 
comparisons on which balance is required are specified, the parameters 
of the fundamental problem can be deduced as solutions to a mathe- 
matical model. One need only evaluate the variances of the estimates in 
terms of variables $r_i$ and $\lambda_{ij}$ giving the number of replications for the 
i-th variety and number of blocks in which the i-th and j-th varieties 
occur together and express the conditions for their equality. These are, 
however, obtainable in simple cases from purely empirical considera- 
tions. The first intuitive solution, the balanced incomplete block in- 
troduced by Yates, achieves complete balance over all possible com- 
parisons. Now we may go a step further and enquire whether there 
exist designs in which the varieties can be thrown into several groups 
such that there is balance on all the comparisons arising from any group 
and also on comparisons arising from varieties belonging to two differ- 
ent groups. This is desirable in view of the fact that in agronomy we 
usually come across experiments involving varieties which can be clas- 
sified into two or more groups corresponding to certain characteristics. 
For instance, the experiment may be concerned with some local strains 
and some newly introduced strains in which case we may have to 
differentiate them into two groups such that all comparisons arising 
out of new strains may have equal efficiency and so also all comparisons 
between local and new strains. Incidentally differences in local strains 
may also be calculated and tested with equal but less efficiency. Or 
again, we may be conducting an experiment wherein some varieties 
need early sowing and others late sowing or some are early ripening and 
others late ripening. This problem is sought to be tackled by the in- 
troduction of a new series of designs called the intra and inter group 
balanced. 

The combinatorial problem leading to an intra- and inter group 
balanced design is as follows. There are v varieties, which fall into m 
groups consisting of $v_1, v_2, \ldots$ and $v_m$ varieties, to be arranged in b 
blocks of k plots each such that 

(a) every variety of the i-th group is replicated $r_i$ times, i = 1, 2, ..., m; 

(b) every pair of varieties of the i-th group occurs in $\lambda_{ii}$ blocks 
(i = 1, 2, ..., m) and every pair consisting of varieties from two 
different groups, the i-th and j-th, occurs in $\lambda_{ij}$ blocks. 

The parameters introduced above satisfy the following conditions 

$$\lambda_{ij} = \lambda_{ji}$$
$$bk = r_1v_1 + \cdots + r_mv_m$$
$$r_j(k - 1) + \lambda_{jj} = v_1\lambda_{j1} + \cdots + v_m\lambda_{jm}, \qquad j = 1, 2, \ldots, m.$$

The computational methods leading to intra block analysis for the 
practically useful designs involving two groups can be set out as shown 
below. 
Let $Q_{ij}$ = sum of yields of the j-th variety in the i-th group minus the 
sum of means of blocks in which it occurs, 
$Q_{i\cdot} = \sum_j Q_{ij}$ (summed over varieties in the i-th group), 
$v_{ij}$ = effect of the j-th variety of the i-th group. 
If there are two groups with the parameters 

    v_1    v_2    b    λ_11    λ_12
    r_1    r_2    k    λ_21    λ_22

then the varietal effects are 

$$v_{1j} = \frac{k}{S_1}\Big[Q_{1j} + \frac{r_2\lambda_{11}Q_{1\cdot} + r_1\lambda_{12}Q_{2\cdot}}{\lambda_{12}\,bk}\Big]$$

$$v_{2j} = \frac{k}{S_2}\Big[Q_{2j} + \frac{r_2\lambda_{12}Q_{1\cdot} + r_1\lambda_{22}Q_{2\cdot}}{\lambda_{12}\,bk}\Big]$$

where 

$$S_1 = \lambda_{11}v_1 + \lambda_{12}v_2 \qquad \text{and} \qquad S_2 = \lambda_{12}v_1 + \lambda_{22}v_2.$$

There are three types of comparisons with their associated variances as 
given below 

$$V(v_{1i} - v_{1j}) = 2k\sigma^2/S_1$$
$$V(v_{2i} - v_{2j}) = 2k\sigma^2/S_2$$
$$V(v_{1i} - v_{2j}) = k\sigma^2\big[\lambda_{12}(S_1 + S_2) + \lambda_{11}\lambda_{22} - \lambda_{12}^2\big]/\lambda_{12}S_1S_2.$$

The sum of squares due to varieties (eliminating blocks) is 

$$\sum_j v_{1j}Q_{1j} + \sum_j v_{2j}Q_{2j}$$

which is most convenient for computation. The analysis of variance 
leading to the estimate of $\sigma^2$ and tests of significance can be set out in 
the usual manner. 

Combined intra and inter block estimates and the variances are ob- 
tained by the changes given in (3.13). 

6. AN ILLUSTRATIVE EXAMPLE: INTRA AND INTER-BLOCK INFORMATION 

The following data have been obtained from the example 2 given in 
Yates [17] by omitting all blocks containing the variety t and taking the 
yields to the nearest integer. The layout and yields of the remaining 
blocks are as given in Table 3. 


TABLE 3 
THE LAYOUT AND YIELDS FOR 20 VARIETIES 

          ..        m 32    r 46    g 49    j 59    e 31    k 25    n 55
          ..        l 44    c 37    a 34    u 73    c 35    f 47    o 38
          ..        j 52    ..      f 46    o 49    a 41    u 51    a 39
          ..        a 51    q 27    h 55    e 47    b 59    p 59    q 47
          ..        k 32    h 37    i 63    h 78    d 45    d 51    p 48

Total:    173       211     164     247     306     211     233     227

          ..        i 65    n 57    g 67    f 67    b 67    m 45    j 74
          ..        d 47    g 55    r 71    b 71    u 57    d 62    p 69
          ..        o 50    u 55    m 46    r 46    i 66    n 77    c 46
          ..        l 58    c 39    e 43    n 43    m 32    s 47    s 50
          ..        r 65    l 51    p 65    j 65    q 49    h 82    i 65

Total:    223       278     257     292     419     271     313     304


The original design is a balanced incomplete block with 21 varieties 
and 21 blocks. The reduced design is partially balanced with 20 varie- 
ties and 16 blocks. The other parameters of this design are calculated as 

$$\lambda_1 = 0, \qquad \lambda_2 = 1, \qquad n_1 = 3, \qquad n_2 = 16$$

$$(p_{ij}^{1}) = \begin{pmatrix} 2 & 0 \\ 0 & 16 \end{pmatrix}, \qquad (p_{ij}^{2}) = \begin{pmatrix} 0 & 3 \\ 3 & 12 \end{pmatrix}.$$

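The calculations leading to intra block analysis and estimates of 
intra and inter block error variances may be set out as follows. To 
start with the constants defined in (5.2, 5.4) are calculated. 

$$\begin{aligned}
A_{12} &= r(k-1) + \lambda_2 = 17 \\
A_{22} &= (\lambda_2 - \lambda_1)\,p_{12}^{2} = 3 \\
B_{12} &= \lambda_2 - \lambda_1 = 1 \\
B_{22} &= r(k-1) + \lambda_2 + (\lambda_2 - \lambda_1)(p_{11}^{1} - p_{11}^{2}) = 19 \\
\Delta &= A_{12}B_{22} - A_{22}B_{12} = 320
\end{aligned}$$

Check: 

$$\begin{aligned}
A_{11} &= r(k-1) + \lambda_1 = 16 \\
A_{21} &= (\lambda_1 - \lambda_2)\,p_{12}^{1} = 0 \\
B_{11} &= \lambda_1 - \lambda_2 = -1 \\
B_{21} &= r(k-1) + \lambda_1 + (\lambda_1 - \lambda_2)(p_{22}^{2} - p_{22}^{1}) = 20 \\
\Delta &= A_{11}B_{21} - A_{21}B_{11} = 320
\end{aligned}$$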
Sum of squares due to varieties (eliminating blocks) 

$$= \frac{1}{5}\sum (\gamma)(\eta) = 7306.94.$$


TABLE 4 

ESTIMATION OF VARIETAL EFFECTS (INTRA BLOCK INFORMATION) 
(Grand Mean = 51.61) 

Variety   (α)     (β)     (γ)    (δ)*          (ε)     (ξ)†     (η)      (ζ)
a         165     896     -71    a, r, s, u    -49    -1371    -4.28    47.33
b         245    1074     151    b, h, l, p    541     2479     7.75    59.36
c         157     936    -151    c, f, m, o   -524    -2496    -7.80    43.81
d         205    1035     -10    d, g, j, q    171     -371    -1.16    50.45
e         162    1032    -222    e, i, k, n   -139    -4301   -13.44    38.17
f         230    1122      28    c, f, m, o   -524     1084     3.39    55.00
g         208     969      71    d, g, j, q    171     1249     3.90    55.51
h         252    1030     230    b, h, l, p    541     4059    12.68    64.29
i         259    1100     195    e, i, k, n   -139     4039    12.62    64.23
j         283    1240     175    d, g, j, q    171     3329    10.40    62.01
k          95     781    -306    e, i, k, n   -139    -5981   -18.69    32.92
l         196     969      11    b, h, l, p    541     -321    -1.00    50.61
m         155    1087    -312    c, f, m, o   -524    -5716   -17.86    33.75
n         282    1216     194    e, i, k, n   -139     4019    12.56    64.17
o         179     984     -89    c, f, m, o   -524    -1256    -3.93    47.68
p         241    1056     149    b, h, l, p    541     2439     7.62    59.23
q         164     885     -65    d, g, j, q    171    -1471    -4.60    47.01
r         268    1153     187    a, r, s, u    -49     3789    11.84    63.45
s         147    1013    -278    a, r, s, u    -49    -5511   -17.22    34.39
u         236    1067     113    a, r, s, u    -49     2309     7.22    58.83

Total    4129   20645       0                    0        0     0      1032.20

Check: 5 × 4129 = 20645; 1032.20 = 20 × 51.61. 

(α) Total yield of a variety. (β) Sum of block totals in which the variety occurs. (γ) = k(α) - (β) = kQ 
(here k = 5). (δ) Variety and its first associates. (ε) Sum of (γ) for varieties of (δ). (ξ) = (B_22 + B_12)(γ) 
- B_12(ε). (η) = (ξ)/Δ. (ζ) = (η) + grand mean = adjusted values for varieties. 

* The first or second associates are considered, whichever is smaller in number. Usually the sets of 
varieties in this column are identical for (n_1 + 1) (here 4) varieties. 
† The formula for second associates is (B_21 + B_11)(γ) - B_11(ε). 

Sum of squares due to blocks (ignoring varieties) 
= (1/5)(173² + 211² + ... + 304²) - 213108.01 = 11700.59. 
Sum of squares due to varieties (ignoring blocks) 
= (1/4)(165² + 245² + ...) - 213108.01 = 12927.74. 
Total sum of squares 
= Σy² - 213108.01 = 20868.99. 
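
The two sums of squares computed from totals can be verified directly. The sketch below uses the sixteen block totals of Table 3 and the twenty variety totals of Table 4; the function name is an assumption.

```python
BLOCK_TOTALS = [173, 211, 164, 247, 306, 211, 233, 227,
                223, 278, 257, 292, 419, 271, 313, 304]
VARIETY_TOTALS = [165, 245, 157, 205, 162, 230, 208, 252, 259, 283,
                  95, 196, 155, 282, 179, 241, 164, 268, 147, 236]


def sum_of_squares(totals, plots_per_total, n_plots):
    """Sum of squared totals over plots per total, less the correction factor."""
    grand_total = sum(totals)
    correction = grand_total ** 2 / n_plots
    return sum(t * t for t in totals) / plots_per_total - correction


ss_blocks = sum_of_squares(BLOCK_TOTALS, 5, 80)       # blocks of 5 plots
ss_varieties = sum_of_squares(VARIETY_TOTALS, 4, 80)  # 4 replications
```

Both sets of totals add to the grand total 4129, so the same correction factor, 4129²/80 = 213108.01, applies to each.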
The analysis of variance leading to tests of significance concerning 
intra block estimates and estimates of intra and inter block variances 
is set out in Tables 5, 6 and 7. 


TABLE 5 
ANALYSIS OF VARIANCE 

                                 d.f.    S.S.            S.S.         d.f.
Blocks (ignoring varieties)      15      11700.59        6079.79*     15    Blocks (eliminating varieties)
Varieties (eliminating blocks)   19       7306.94       12927.74      19    Varieties (ignoring blocks)
Error                            45       1861.46*  ->   1861.46      45    Error
Total                            79      20868.99   ->  20868.99      79    Total

* Obtained by subtraction. 


TABLE 6 
TEST FOR VARIETAL SUM OF SQUARES 

                                 d.f.    S.S.       m.s.      Ratio
Varieties (eliminating blocks)   19      7306.94    384.57    9.29
Error                            45      1861.46     41.37


The ratio 9.29 on 19 and 45 degrees of freedom is significant. 


TABLE 7 
ESTIMATION OF INTRA AND INTER BLOCK ERROR VARIANCES 

                                 S.S.       Expectation
Blocks (eliminating varieties)   6079.79    15σ² + 60β²
Error                            1861.46    45σ²


This gives 

$$\sigma^2 = 41.37 \quad \text{and} \quad \beta^2 = 90.99$$

hence 

$$\sigma'^2 = 41.37 + 5 \times 90.99 = 496.32$$
$$w = 1/\sigma^2 = .02417$$
$$w' = 1/\sigma'^2 = .002015$$
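Having estimated the relative weights w and w', the following cal- 
culations have to be made 

$$\begin{aligned}
R &= r(w + w'/(k-1)) = .09870 \\
\Lambda_2 &= \lambda_2(w - w') = .02215 \\
\Lambda_1 &= \lambda_1(w - w') = 0
\end{aligned}$$

$$\begin{aligned}
A_{12}' &= R(k-1) + \Lambda_2 = .41695 \\
A_{22}' &= (\Lambda_2 - \Lambda_1)\,p_{12}^{2} = .06645 \\
B_{12}' &= \Lambda_2 - \Lambda_1 = .02215 \\
B_{22}' &= A_{12}' + B_{12}'(p_{11}^{1} - p_{11}^{2}) = .46125 \\
\Delta' &= A_{12}'B_{22}' - A_{22}'B_{12}' = .19085
\end{aligned}$$

Check: 

$$\begin{aligned}
A_{11}' &= R(k-1) + \Lambda_1 = .39480 \\
A_{21}' &= (\Lambda_1 - \Lambda_2)\,p_{12}^{1} = 0 \\
B_{11}' &= \Lambda_1 - \Lambda_2 = -.02215 \\
B_{21}' &= A_{11}' + B_{11}'(p_{22}^{2} - p_{22}^{1}) = .48340 \\
\Delta' &= A_{11}'B_{21}' - A_{21}'B_{11}' = .19085.
\end{aligned}$$

Before proceeding to calculate combined intra and inter block esti- 
mates it is instructive to examine whether considerable increase in 
efficiency can be achieved by extra computation. 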

The variances for differences of varietal effects built from intra block 
comparisons are, as given in (5.6), 

$$\frac{2kB_{21}}{\Delta}\sigma^2 = \frac{10 \times 20}{320} \times 41.37 = 25.8562$$

and 

$$\frac{2kB_{22}}{\Delta}\sigma^2 = \frac{10 \times 19}{320} \times 41.37 = 24.5634$$

for first and second associates respectively. There are 30 comparisons 
of the first kind and 160 comparisons of the second kind so that the 
average variance per comparison is 

$$\frac{3 \times 25.8562 + 16 \times 24.5634}{19} = 24.7675.$$


The same formulae hold good for combined intra and inter block 
estimates. Thus the variances are 

$$\frac{2kB_{21}'}{\Delta'} = \frac{10 \times .48340}{.19085} = 25.3288$$

and 

$$\frac{2kB_{22}'}{\Delta'} = \frac{10 \times .46125}{.19085} = 24.1682$$


for the two types of comparisons. The decrease in variances when com- 
pared with the above is very small. The average variance per compari- 
son in this case is 24.3514 which differs very little from 24.7675 calcu- 
lated above. In practice the inter block analysis in such cases may not 
be undertaken. I have carried out the computations to show that they 
can be obtained by merely extending the Table 4 used to calculate 
intra block estimates. 
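
The comparison of average variances made above can be reproduced in a few lines. The helper below is an assumed illustration: it takes the constants $B_{21}$, $B_{22}$ and $\Delta$ (or their primed counterparts) together with a multiplier, which is $\sigma^2$ for the intra block analysis and 1 for the combined analysis, since the weights are already absorbed in the primed constants.

```python
def average_variance(k, B21, B22, delta, n1, n2, multiplier=1.0):
    """Average variance of a difference over first and second associate pairs."""
    first = 2 * k * B21 * multiplier / delta
    second = 2 * k * B22 * multiplier / delta
    # Pairs occur in the ratio n1 : n2, so weight the two variances accordingly.
    return (n1 * first + n2 * second) / (n1 + n2)


intra = average_variance(5, 20, 19, 320, 3, 16, multiplier=41.37)
combined = average_variance(5, 0.48340, 0.46125, 0.19085, 3, 16)
relative_gain = 1.0 - combined / intra
```

The relative reduction in average variance comes to less than two per cent, which is the basis for the remark that the inter block analysis may not be worth undertaking in such cases.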
To test for the significance of varieties we calculate 

$$\chi^2 = \frac{1}{5}\Big[\sum (A)(D) - \frac{(\sum A)(\sum D)}{v}\Big] = \frac{1}{5}(3078.3612 - 2152.2815) = 185.2159$$

which on 19 degrees of freedom for varieties is highly significant. This 
test is valid only when inter block error is estimated on a large number 
of degrees of freedom as in this case. In the above formula for $\chi^2$ the 
correction term is introduced in calculating the expression for P since 
in column (A) of Table 8 a multiple of the grand mean is omitted. 


TABLE 8
COMBINED INTRA AND INTER BLOCK ESTIMATES

Variety       (A)        (B)        (C)        (D)
a            .094      7.156      −.593     46.704
b           5.819     21.416     12.253     59.550
c          −1.759     −4.325     −3.953     43.344
d           1.849     12.474      3.235     50.532
e          −3.281      4.981     −8.889     38.407
f           2.943     −4.325      7.957     55.254
g           3.673     12.474      7.856     55.153
h           7.640     21.416     16.864     64.161
i           6.935      4.981     16.988     64.285
j           6.73      12.474     15.610     62.905
k          −5.818      4.981    −15.316     31.981
l           2.223     21.416      3.145     50.442
m          −5.345     −4.325    −13.037     34.260
n           7.145      4.981     17.520     64.816
o           −.163     −4.325       .088     47.385
p           5.734     21.416     12.038     59.335
q            .217     12.474      −.899     46.397
r           6.849      7.156     16.517     63.816
s          −4.673      7.156    −12.667     34.630
t           4.887      7.156     11.546     58.843
Total     41.7028                86.263   1032.200

(A)* $= kw(\alpha) - (w - w')(\beta) = 5P$ (where $(\alpha)$ and $(\beta)$ are from Table 4).

(B) = Sum of (A) for the variety and its first associates.

(C) $= [(B_{12}' + B_{22}')(A) - B_{12}'(B)]/\Delta'$ (same formula used in Table 4).

(D) = (C) − mean of values in (C) + grand mean (51.61) = adjusted varietal effects.

* The actual formula is $w[k(\alpha) - (\beta)] + w'[(\beta) - r \text{ times the grand mean}]$. Since the replication for 
each variety is the same the last term may be omitted and an adjustment for that may be made in 
calculating (D) as shown above. 
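
The construction of columns (C) and (D) from columns (A) and (B) can be sketched as follows. This is an illustration under stated assumptions rather than the author's worksheet: the multiplier of (A) is taken as .48340 and that of (B) as .48340 − .46125, both inferred from the variance numerators quoted earlier, and only the first four varieties are carried through.

```python
# Columns (C) and (D) of Table 8 from columns (A) and (B).
# COEF_A and COEF_B are the multipliers of (A) and (B) in the formula for (C);
# COEF_B = .48340 - .46125 is an inference from the two variance numerators.
COEF_A = 0.48340
COEF_B = 0.48340 - 0.46125
DELTA = 0.19085
GRAND_MEAN = 51.61
MEAN_C = 86.263 / 20      # mean of column (C), from the column total in Table 8

def column_c(a, b):
    """Combined estimate before adjustment to the grand mean."""
    return (COEF_A * a - COEF_B * b) / DELTA

def column_d(c):
    """Adjusted varietal effect."""
    return c - MEAN_C + GRAND_MEAN

# Varieties a, b, c, d as printed in Table 8.
A = [0.094, 5.819, -1.759, 1.849]
B = [7.156, 21.416, -4.325, 12.474]
C = [column_c(a, b) for a, b in zip(A, B)]
D = [column_d(c) for c in C]

print([round(c, 3) for c in C])
print([round(d, 3) for d in D])
```

The values agree with the printed columns to within a unit or two in the third decimal place, which is what one would expect from three-figure rounding of (A) and (B).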


Tests of individual comparisons may be made by comparing the ob- 
served differences with standard errors as calculated in the article. No 
exact test seems to be possible. 

The problem of design of experiments is not completely solved with- 
out the preparation of a Dictionary of Designs from which the experi- 
menter can be supplied the best or the only possible design satisfying his 
requirements as to the number of varieties, extent of experimental land, 
etc. A large number of designs constructed by the author is awaiting 
publication. Much mathematical research is necessary before all useful 
designs can be bagged. 
TUMBLER MORTALITY 
I. INTRODUCTION 


1. The statistical problem. Life tables are constructed for 
certain types of equipment, under specified conditions of use, 
from records of experience with the equipment in service. 
Items such as electric lamps, glass tumblers, and silk stock- 
ings, under the normal conditions of use, in effect may be said 
to “die” when they “burn out,” “crack,” or “run.” The data 
and analysis necessary to obtain an estimate of the mortality 
distribution, life expectancy, and other similar character- 
istics of such equipment, are exactly analogous to the more 
familiar techniques applied in the case of human mortality 
experience. This paper presents the results of an analysis of a 
service test that was conducted in order to estimate the mean 
lengths of life for each of two types of glass tumbler when 
used in a particular cafeteria, and discusses statistical tech- 
niques that proved to be well-suited for the treatment of the 
problem. Technological considerations of a model for tumbler 
breakage are given, leading to familiar mortality curves of 
Makeham-Gompertz type. 

2. The service test.* A fixed number of tumblers of each of 
two types, called “annealed” and “toughened,” were kept in 
service at all times in the test cafeteria. At the end of each 
week, each broken tumbler was replaced by a new one of the 
same type. A record was kept of the date each tumbler was 
introduced into service and of the week each broken tumbler 
was removed from service. The test was continued for 78 
weeks employing 60 annealed tumblers and 120 toughened 
tumblers. 


II. THE SERVICE TEST ANALYSIS 


1. Discussion. What complicates the analysis of this experiment is 
the fact that the (necessarily) finite nature of the experiment leads to 
truncated samples. This unpleasant feature might be avoided by excluding 
from the analysis any generations not completely extinct by 
the end of the experiment. The procedure would, however, be extremely 
wasteful in the annealed case, and would be fatal in the 
toughened case, since practically no data would be retained. A further 
difficulty is the fact that each subsequent generation of replacements 
has its history truncated relatively earlier than the preceding generations. 
Hence, except for direct estimation of the stable replacement 
rate by averaging weekly breakages over a sufficiently long experiment, 
any estimates must be based on an assumed analytical form for the 
mortality distribution. 

* The service test was conducted by H. Scholtz and A. Basham, of the Preston Laboratories. The 
work was sponsored by the Libbey Glass Company, the Federal Glass Company, and the Preston 
Laboratories. Acknowledgment is due to T. Collins and H. Black, H. H. Blau and Conrad Stone, and 
F. W. Preston, as the representatives of these three companies who provided impetus and guidance for 
the test program. 

The method described below makes effective use of all the data and 
supplies essentially optimum estimates of the parameters, provided 
only that the conditions of handling of the tumblers may be considered 
homogeneous from week to week. Mean lengths of life were estimated 
as 8.7 and 60.4 weeks for the annealed and toughened tumblers, re- 
spectively. 

2. Statistical method. Consider a batch of $N_a$ tumblers, introduced 
together into the experiment at the beginning of the $a$th week. Let $n_{ai}$ 
be the number of tumblers of this group broken in the $i$th week following 
their introduction. Then 

$$N_{ai} = N_a - \sum_{j<i} n_{aj}$$

is the number of individuals of this group surviving $i-1$ weeks, hence 
eligible for possible breakage in the $i$th week. Then $n_{ai}$ is an observation 
on a sample of $N_{ai}$ individuals distributed according to the binomial 
distribution with probability equal to the conditional probability of 
breakage in the $i$th week of service, assuming survival of $i-1$ weeks of 
service. The aggregate of all such samples for successive values of $i$, 
terminating either when $N_{ai}=0$ or when the last week of the experiment 
terminates the observations, contains all the available information 
on the history of the generation in question. Moreover the successive 
$n_{ai}$, for given $N_{ai}$, are mutually independent in probability. 

Let the mortality distribution be $f(x)\,dx$, i.e., the probability of 
breakage in the $i$th week since introduction is given by 

$$P_i = \int_{i-1}^{i} f(x)\,dx.$$

Then it is easily seen that the conditional probability relevant to the 
samples considered above is given by 

$$P_i^c = \frac{\int_{i-1}^{i} f(x)\,dx}{\int_{i-1}^{\infty} f(x)\,dx} = \frac{\int_{i-1}^{i} f(x)\,dx}{1 - \int_{0}^{i-1} f(x)\,dx}.$$

Assuming homogeneity of experimental conditions from week to week 
the samples from the various generations may be pooled by summation, 
so that 

$$N_i = \sum_a N_{ai} \quad \text{and} \quad n_i = \sum_a n_{ai}$$

provide a sample on the binomial distribution with probability $P_i^c$. 
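
The bookkeeping described in this section can be sketched as follows. The data are invented for illustration (ten tumblers kept in service for four weeks), and the function and variable names are assumptions, not part of the original test records.

```python
# Pool the weekly exposure and breakage counts over generations.
# Each generation is (week_introduced, N_a, [n_a1, n_a2, ...]); the data are made up.

def pooled_counts(generations, last_week):
    exposed, broken = {}, {}                  # keyed by week of service i
    for start, n_start, breaks in generations:
        at_risk = n_start                     # N_a1 = N_a
        observable = last_week - start + 1    # weeks of service seen before the test ends
        for i, n_ai in enumerate(breaks[:observable], start=1):
            if at_risk == 0:
                break
            exposed[i] = exposed.get(i, 0) + at_risk
            broken[i] = broken.get(i, 0) + n_ai
            at_risk -= n_ai                   # N_a,i+1 = N_ai - n_ai
    return exposed, broken

generations = [
    (1, 10, [2, 3, 1, 4]),   # original stock
    (2, 2, [0, 1, 1]),       # replacements for week-1 breakages
    (3, 3, [1, 0]),          # replacements for week-2 breakages
    (4, 3, [0]),             # replacements for week-3 breakages
]
N, n = pooled_counts(generations, last_week=4)
ratios = {i: n[i] / N[i] for i in N}   # observed conditional breakage rates n_i / N_i
print(N, n)
```

Later generations contribute only to the early weeks of service, which is the truncation that the discussion above refers to.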

The method of “maximum likelihood” furnishes the key to the problems 
of estimation and testing concerned with the distribution $f(x)\,dx$. 
The quantities $P_i^c$ depend directly on the form of $f(x)$ and on the 
parameters involved. The joint probability of obtaining the observations 
$n_i$ for given $N_i$ can be evaluated as a function of the $P_i^c$, hence 
as a function of the basic parameters. The parameter values which 
maximize the joint probability are essentially optimum estimates of 
the parameters. The ratio of this maximum joint probability to the 
maximum obtainable with arbitrary $P_i^c$ measures the “goodness of fit,” 
and therefore supplies a criterion for testing the form of $f(x)$. Other statistical 
tests can be based on maxima obtained by holding some of the 
parameters fixed and maximizing with respect to the remainder. 

Significance tests based on the maximum likelihood method are, 
except in certain special instances, generally based on asymptotic 
approximations. In the cases under consideration here no appreciable 
loss results from replacing the likelihood ratio by the asymptotically 
equivalent $\chi^2$ corresponding to the Pearson test for goodness of fit, i.e., 

$$\chi^2 = \sum_i \frac{(n_i - N_i P_i^c)^2}{N_i P_i^c (1 - P_i^c)}.$$

Let $f(x)$ depend on parameters $\theta_1, \theta_2, \ldots, \theta_p$ and assume that 
samples are included from a total of $k$ basic intervals, with $k > p$. Denote 
by $\hat\theta$ the parameter values which minimize $\chi^2(\theta_1, \theta_2, \ldots, \theta_p)$. 
Then the $\hat\theta$ values are estimates of the corresponding parameters, 
and 

$$\chi^2_{k-p} = \chi^2(\hat\theta_1, \hat\theta_2, \ldots, \hat\theta_p)$$

is distributed approximately according to the $\chi^2$ distribution with 
$k-p$ degrees of freedom, provided $f(x)$ actually has the functional 
form assumed. Thus an excessive value of $\chi^2_{k-p}$ would indicate a poor 
fit to the assumed form. Now consider a fixed value $\theta_1$ and let 
$\tilde\theta_2, \ldots, \tilde\theta_p$ represent the new values which minimize $\chi^2$ with $\theta_1$ fixed. 
Then 

$$\chi^2_{k-p+1}(\theta_1) = \chi^2(\theta_1, \tilde\theta_2, \ldots, \tilde\theta_p)$$

has, approximately, the $\chi^2$ distribution with $k-p+1$ degrees of freedom, 
and 

$$\chi^2_1(\theta_1) = \chi^2_{k-p+1} - \chi^2_{k-p}$$

is independent of $\chi^2_{k-p}$, and distributed like $\chi^2$ with one degree of freedom. 
In this manner, the standard error of estimate for $\theta_1$ may itself 
be estimated, by varying $\theta_1$ around $\hat\theta_1$, determining $\tilde\theta_2, \ldots, \tilde\theta_p$ for the 
given $\theta_1$, and extracting the square root of the quantity $\chi^2_1(\theta_1)$. This 
procedure corresponds to the frequently used “fiducial argument.” 

3. Application of the Method. In the present instance there was theoretical 
justification for trying the incomplete gamma function distribution¹ 

$$f(x)\,dx = \frac{\lambda^{\mu}}{\Gamma(\mu)}\, x^{\mu-1} e^{-\lambda x}\,dx.$$

No obvious direct calculation exists, except in the special case $\mu = 1$, 
for determining the parameter values which best fit the data. Fortunately 
the incomplete gamma function has been tabled (K. Pearson, 
Biometrika, 1922), in the form 

$$I(u, p) = \frac{1}{\Gamma(p+1)} \int_0^{u\sqrt{p+1}} e^{-v} v^{p}\,dv.$$

It is therefore possible to carry out an empirical minimization of $\chi^2$ by 
successive approximations. In the neighborhood of the minimum one 
can proceed by minimizing with respect to $\lambda$ for fixed $\mu$, then minimizing 
with respect to $\mu$ using the $\lambda$ value obtained by the previous process, 
etc. The process is laborious, since each minimization requires the computation 
of several complete trials, but feasible, for lack of a better 
procedure. 
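
The alternating minimization described above can be sketched in a few lines. This is an illustration, not the hand computation of the original analysis: the incomplete gamma function is evaluated by its power series rather than read from Pearson's tables, each one-parameter minimization is done by a golden-section search (a substitute technique), and the starting values and search limits are arbitrary choices. The exposure and breakage counts are those of the toughened tumblers by five-week groups.

```python
import math

def gamma_cdf(x, shape, rate):
    """Regularized lower incomplete gamma P(shape, rate*x) by its power series."""
    z = rate * x
    if z <= 0:
        return 0.0
    term = 1.0 / shape
    total = term
    k = 1
    while abs(term) > 1e-15 * total:
        term *= z / (shape + k)
        total += term
        k += 1
    return total * math.exp(shape * math.log(z) - z - math.lgamma(shape))

# Toughened tumblers: interval end points in weeks, exposed N, broken n.
EDGES = [0, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 65, 75]
N = [241, 223, 208, 188, 170, 146, 132, 110, 94, 86, 73, 60, 45]
n = [7, 11, 10, 9, 17, 10, 13, 9, 6, 9, 5, 14, 10]

def chi_square(rate, shape):
    F = [gamma_cdf(t, shape, rate) for t in EDGES]
    total = 0.0
    for i in range(len(N)):
        p = (F[i + 1] - F[i]) / (1.0 - F[i])        # conditional probability
        total += (n[i] - N[i] * p) ** 2 / (N[i] * p * (1.0 - p))
    return total

def golden_min(func, lo, hi, iterations=35):
    """Golden-section search for the minimum of func on [lo, hi]."""
    ratio = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    c, d = b - ratio * (b - a), a + ratio * (b - a)
    for _ in range(iterations):
        if func(c) < func(d):
            b, d = d, c
            c = b - ratio * (b - a)
        else:
            a, c = c, d
            d = a + ratio * (b - a)
    return (a + b) / 2

rate, shape = 0.03, 1.5            # arbitrary starting values near the minimum
for _ in range(25):                # alternate the two one-parameter minimizations
    rate = golden_min(lambda r: chi_square(r, shape), 0.005, 0.1)
    shape = golden_min(lambda s: chi_square(rate, s), 0.5, 4.0)

print(round(rate, 4), round(shape, 2), round(chi_square(rate, shape), 1))
```

Because the two parameters are strongly correlated, each sweep moves only part of the way along the valley of the surface, which is why several alternations are needed even with a machine doing the arithmetic.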

In analyzing the annealed tumblers, the basic unit of one week was 
preserved, up to 20 weeks, pooling weeks 21-25 and weeks 26-30 because 
of the small sample sizes in these groups. The one week subdivision 
was unnecessarily fine for the toughened tumblers, because 
of their considerably longer life, so that the data were grouped according 
to five-week units, except for the pooling of weeks 56-65 and 66-75. 
The resultant reduction in labor more than justified the trivial loss of 
information associated with the grouping. 

1 A discussion, with applications, of the renewal theory based on this distribution function may be 
found in a paper by A. W. Brown entitled “A Note on the Use of a Pearson Type III Function in Renewal 
Theory” published in The Annals of Mathematical Statistics, Vol. XI, December 1940, pp. 448-453. 

The accompanying sample computations (Tables 1, 2) show the 
calculations corresponding to the optimum parameter values for the 
two types of tumblers. Figures 1 and 3 show the corresponding 
theoretical mortality distribution and the theoretical curve of conditional 
probability by weeks, for the annealed tumblers; the curves of 
Figures 2 and 4 are the corresponding curves for the toughened 
tumblers, by five-week groups. 

TABLE 1

         Exposed  Broken
Week        N        n      n/N     F_i     P_i^c     σ        δ       δ/σ
1          549      23     .042    .0410    .041    .0086     .001     .12
2          521      38     .073    .1127    .075    .0115    −.002    −.17
3          470      48     .102    .1934    .091    .0133     .011     .83
4          415      40     .096    .2763    .103    .0149    −.007    −.47
5          371      36     .097    .3564    .111    .0163    −.014    −.86
6          331      42     .127    .4317    .117    .0177     .010     .56
7          285      35     .123    .5008    .122    .0194     .001     .05
8          247      33     .134    .5638    .126    .0211     .008     .38
9          210      26     .124    .6199    .129    .0231    −.005    −.22
10         182      17     .093    .6698    .131    .0250    −.038   −1.52
11         163      22     .135    .7145    .134    .0267     .001     .04
12         139      16     .115    .7537    .136    .0291    −.021    −.72
13         123       8     .065    .7876    .138    .0311    −.073   −2.35
14         114      19     .167    .8174    .140    .0326     .027     .83
15          95      17     .179    .8432    .141    .0357     .038    1.06
16          78      13     .167    .8656    .143    .0396     .024     .61
17          65       8     .123    .8850    .144    .0436    −.021    −.48
18          56       7     .125    .9016    .145    .0470    −.020    −.43
19          49      12     .245    .9161    .146    .0504     .099    1.96
20          37       5     .135    .9284    .147    .0582    −.012    −.21
21-25       31      17     .548    .9680    .553    .0893    −.005    −.06
26-30       14      10     .714    .9860    .563    .1325     .151    1.14

$\chi^2 = \sum (\delta/\sigma)^2 = 18.3$

$F_i = \int_0^i f(x)\,dx$ (from tables of the incomplete gamma function)

$P_i^c = \dfrac{F_i - F_{i-1}}{1 - F_{i-1}}, \qquad \sigma = \sqrt{\dfrac{P_i^c(1 - P_i^c)}{N}}, \qquad \delta = \dfrac{n}{N} - P_i^c$
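
The derived columns of the annealed-case computation follow from the cumulative probabilities $F_i$ by simple arithmetic. The sketch below retraces the first five weeks; the counts and $F_i$ values are taken from the annealed table, while the variable names and the rounding are illustrative choices.

```python
import math

# First five weeks of the annealed case: exposed N, broken n, and the
# cumulative fitted probabilities F_i read from the table (F_0 = 0).
N = [549, 521, 470, 415, 371]
n = [23, 38, 48, 40, 36]
F = [0.0, 0.0410, 0.1127, 0.1934, 0.2763, 0.3564]

rows = []
for i in range(1, len(F)):
    p_cond = (F[i] - F[i - 1]) / (1.0 - F[i - 1])          # conditional probability
    sigma = math.sqrt(p_cond * (1.0 - p_cond) / N[i - 1])  # binomial standard error
    delta = n[i - 1] / N[i - 1] - p_cond                   # observed minus fitted
    rows.append((i, round(p_cond, 3), round(sigma, 4), round(delta / sigma, 2)))

for row in rows:
    print(row)
```

Small discrepancies in the last figure of $\sigma$ and $\delta/\sigma$ are to be expected, since the tabled $F_i$ are themselves rounded to four places.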

The small minimum $\chi^2$ values obtained indicate that in both cases 
the gamma distribution fits the data very well. In the annealed case, 
the value of 18.3 on 20 degrees of freedom is within the 50% point, 
that is, under random sampling from gamma function distributions the 
value of $\chi^2_{20}$ would exceed the observed value more than 50% of the 
time. In the toughened case the value of 7.0 on 11 degrees of freedom 
would be exceeded about 80% of the time under random sampling. 
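
The two tail probabilities can be checked without recourse to printed tables. The following is a minimal sketch that evaluates the upper tail of the $\chi^2$ distribution from the series for the incomplete gamma function; the function name is illustrative.

```python
import math

def chi_square_upper_tail(x, dof):
    """P(chi-square with dof degrees of freedom exceeds x), via the power
    series for the regularized lower incomplete gamma function."""
    a, z = dof / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    k = 1
    while term > 1e-16 * total:
        term *= z / (a + k)
        total += term
        k += 1
    lower = total * math.exp(a * math.log(z) - z - math.lgamma(a))
    return 1.0 - lower

annealed = chi_square_upper_tail(18.3, 20)    # 22 intervals less 2 parameters
toughened = chi_square_upper_tail(7.0, 11)    # 13 intervals less 2 parameters
print(round(annealed, 2), round(toughened, 2))
```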


TABLE 2
TOUGHENED CASE (λ = .0257, μ = 1.55)

          Exposed  Broken
Weeks        N        n      n/N     F_i     P_i^c     σ        δ       δ/σ
1-5         241       7     .029    .0279    .028    .0106     .001     .09
6-10        223      11     .049    .0757    .049    .0145     .000     .00
11-15       208      10     .048    .1315    .060    .0165    −.012    −.73
16-20       188       9     .048    .1907    .068    .0184    −.020   −1.09
21-25       170      17     .100    .2505    .074    .0201     .026    1.29
26-30       146      10     .068    .3093    .078    .0222    −.010    −.45
31-35       132      13     .098    .3660    .082    .0239     .016     .67
36-40       110       9     .082    .4199    .085    .0266    −.003    −.11
41-45        94       6     .064    .4704    .087    .0291    −.023    −.79
46-50        86       9     .105    .5179    .090    .0309     .015     .49
51-55        73       5     .068    .5625    .093    .0338    −.025    −.74
56-65        60      14     .233    .6411    .181    .0497     .052    1.05
66-75        45      10     .222    .7073    .184    .0578     .038     .66

$\chi^2 = \sum (\delta/\sigma)^2 = 7.0$

$F_i = \int_0^i f(x)\,dx$ (from tables of the incomplete gamma function)

$P_i^c = \dfrac{F_i - F_{i-1}}{1 - F_{i-1}}, \qquad \sigma = \sqrt{\dfrac{P_i^c(1 - P_i^c)}{N}}, \qquad \delta = \dfrac{n}{N} - P_i^c$








Standard errors of estimate for the mean length of life were approximated 
by a minimization procedure involving fixed values for the 
mean. Thus, in the annealed case, the optimum values were $m = 8.7$, 
$\sigma = 6.9$, yielding $\chi^2 = 18.3$. Choosing $m = 9.2$ led to a minimum $\chi^2$ of 
23.4 at $\sigma = 7.6$ and the choice of $m = 8.2$ led to a minimum $\chi^2$ of 22.4 at 
$\sigma = 6.4$. Now $\sqrt{23.4 - 18.3} = \sqrt{5.1} = 2.3$ and $\sqrt{22.4 - 18.3} = \sqrt{4.1} = 2.0$, 
so that the increment of .5 week is equivalent to about 2 standard 
errors, or the estimated standard error is ¼ week. Similarly, in the 
toughened case the optimum values were $m = 60.4$, $\sigma = 48.5$, with a $\chi^2$ 
of 7.0. Minimum $\chi^2$ values corresponding to $m = 65$ and $m = 55$ were 
8.4 and 8.5 respectively, leading to the conclusion that a five-week 
interval is about 1.2 standard errors, in other words, the estimated 
standard error is of the order of four weeks. 
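
The conversion from an increase in the minimum $\chi^2$ to a standard error can be written out explicitly. The sketch below uses only the figures quoted in the preceding paragraph; the function name is illustrative, and the step sizes for the toughened case are taken as the actual distances of the trial means from 60.4.

```python
import math

def standard_error(step, chi_square_fixed, chi_square_min):
    """Standard error implied by moving the mean `step` weeks from its optimum."""
    return step / math.sqrt(chi_square_fixed - chi_square_min)

# Annealed: optimum m = 8.7 (chi-square 18.3); trials at m = 9.2 and m = 8.2.
annealed = [standard_error(0.5, 23.4, 18.3), standard_error(0.5, 22.4, 18.3)]

# Toughened: optimum m = 60.4 (chi-square 7.0); trials at m = 65 and m = 55.
toughened = [standard_error(65 - 60.4, 8.4, 7.0), standard_error(60.4 - 55, 8.5, 7.0)]

print([round(s, 2) for s in annealed], [round(s, 1) for s in toughened])
```

The two sides of each optimum give slightly different answers, which is a reminder that the $\chi^2$ surface is only approximately parabolic and that the standard errors are rough.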
4. Summary of test results. Technological considerations suggested 
the incomplete gamma distribution, 

$$f(x)\,dx = \frac{\lambda^{\mu}}{\Gamma(\mu)}\, x^{\mu-1} e^{-\lambda x}\,dx \qquad (\mu > 0,\ \lambda > 0)$$

closely related to the Pearson Type III curve, for the mortality distribution. 
Statistical analysis indicates that the data for both types of 
tumbler are consistent with this form of distribution. 

The mean and standard deviation of the above distribution are given 
by $m = \mu/\lambda$ and $\sigma = \sqrt{\mu}/\lambda$. Estimates of the parameters for the two 
populations are presented in Table 3 following. 


TABLE 3
ESTIMATES OF PARAMETERS

Tumbler         m        σ        λ        μ
Annealed       8.7      6.9     .183     1.60
Toughened     60.4     48.5     .0257    1.55
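
The entries of Table 3 are tied together by the two moment formulas, and this is easily verified. The sketch below recomputes the mean and standard deviation from the tabled $\lambda$ and $\mu$; the function and argument names are illustrative.

```python
import math

def gamma_moments(rate, shape):
    """Mean and standard deviation of the gamma distribution."""
    return shape / rate, math.sqrt(shape) / rate

annealed = gamma_moments(rate=0.183, shape=1.60)
toughened = gamma_moments(rate=0.0257, shape=1.55)
print([round(v, 1) for v in annealed], [round(v, 1) for v in toughened])
```

Agreement is to within about a tenth of a week, the residue being attributable to rounding of $\lambda$ and $\mu$.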





The estimated mean lengths of life, 8.7 weeks for the annealed, and 
60.4 weeks for the toughened tumblers, are subject to uncertainties measured 
by standard errors of approximately 0.25 and 4.0 weeks, respectively. 

The close agreement between the estimates of the exponent $\mu$ for the 
two cases is of some interest. The analysis does not indicate that the 
difference is statistically significant. Certainly the slight difference is 
of no practical importance. Further investigation is required to de- 
termine whether or not the observed value may be more generally 
applicable. 

5. Tumbler mortality distribution. The technical considerations behind 
the choice of the form of the mortality distribution to be used 
in the analysis of the service test data are discussed in Part III, following. 

Evidence is given to indicate that tumblers may be expected to 
break in service in accordance with the mortality distribution 

$$dF = -e^{\alpha_1} T^{\alpha_0 - 1} e^{-\alpha_1 T}(\alpha_1 T - \alpha_0)\,dT$$

where 

$$T = e^{-\beta t},$$

and where $\alpha_0$, $\alpha_1$, and $\beta$ have a definite technological significance. It is 
noted that this distribution resembles the incomplete gamma distribution 
somewhat, and represents the familiar Makeham-Gompertz 
formula if $\beta$ is negative. 

It is likely that the three-parameter distribution would fit the service 
test data at least as well as did the two-parameter incomplete gamma 
distribution that was used in the actual analysis. No attempt was made 
to test this possibility. The estimation methods that were used are 
equally applicable to either form of distribution, but would involve 
much more tedious computations for the three-parameter form of distribution 
than for the simpler incomplete gamma form. 


III. TUMBLER MORTALITY DISTRIBUTIONS 


1. Technological background.² Tumblers break in service for a variety 
of reasons. One primary cause of breakage is impact at the rim. 
The force of impact required to break a tumbler with a blow at the 
rim will depend upon the thickness of the glass, and upon very many 
other factors but in particular it has been found to depend upon the 
extent of abrasion of the rim surface. 

There is good evidence that tumblers “age” in service, and in partic- 
ular that they are abraded around the rim sufficiently to weaken their 
resistance to breakage on impact. For the purpose of this report the 
theoretical model for breakage is based on the assumption that tum- 
blers in service are subjected continually to wear and impact, but that 
the rate of wear at any time during the service life is proportional to the 
“unwear” at that time. 

For example, suppose that abrasion of the rim were the only form 
of wear involved, and that each point along the rim of a specific tumbler 
at a definite time may be classified either as abraded or unabraded. 
If liability to abrasion is the same for each rim point of each tumbler 
in service then the fraction of the rim which may be expected to be 
abraded at time $t$ is 

$$A(t) = 1 - e^{-\beta t},$$

where $\beta$ is the proportionality factor in the rate of abrasion. The assumed 
exponential aging function $A(t)$ is illustrated by this rim abrasion 
example, but obviously applies much more generally to aging 
processes of all sorts which satisfy the basic assumption that rate of 
wear is proportional to unwear. 

Returning to the rim example, it seems reasonable to assume that 
the probability that a tumbler will break, between times $t$ and $t + \Delta t$, 
under a blow at the rim is 

$$[\mu_1(1 - A(t)) + \mu_2 A(t)]\,\mu_0\,\Delta t = [k_0 - k_1 e^{-\beta t}]\,\Delta t,$$

where $\mu_1$ and $\mu_2$ are the probabilities that a blow on the unabraded and 
abraded portions, respectively, will break the tumbler, and where 
$\mu_0\,\Delta t$ is the probability of a blow during the interval $\Delta t$. There will certainly 
be different kinds of abrasion in practice, with different effects 
on the probabilities of breakage, but the simple model described is the 
only one considered in the present report and may very possibly represent 
reality to a good approximation. 

2 We are indebted to F. W. Preston and J. Glathart, of the Preston Laboratories, for assistance 
with the construction of the general technological model adopted for this discussion. 

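2. Mortality distributions. The technological assumptions of Section 
1, preceding, lead immediately to a mortality distribution. The probability 
$q(t)$ that a tumbler survives to at least time $t$ satisfies the equation 

$$q(t + \Delta t) = q(t)\,[1 - (k_0 - k_1 e^{-\beta t})\,\Delta t].$$

Thus 

$$q(t) = e^{k_1/\beta}\, e^{-k_0 t}\, e^{-(k_1/\beta)e^{-\beta t}}.$$

The mortality distribution is determined by the equation 

$$df = -dq = \left[e^{k_1/\beta}\, e^{-k_0 t}\, e^{-(k_1/\beta)e^{-\beta t}}\,(k_0 - k_1 e^{-\beta t})\right]dt.$$

For convenience, set 

$$T = e^{-\beta t}, \quad \alpha_0\beta = k_0, \quad \text{and} \quad \alpha_1\beta = k_1.$$

Then 

$$df = -e^{\alpha_1} T^{\alpha_0 - 1} e^{-\alpha_1 T}(\alpha_1 T - \alpha_0)\,dT = F(T)\,dT,$$

and 

$$q(t) = e^{\alpha_1} T^{\alpha_0} e^{-\alpha_1 T} = Q(T).$$

A special case of interest is that in which a blow on the unabraded 
portion has zero probability of breakage; in this case $\mu_1 = 0$, so $k_0 = k_1$ 
and $\alpha_0 = \alpha_1 = \alpha$, whence 

$$F(T)\,dT = -\alpha e^{\alpha} T^{\alpha - 1} e^{-\alpha T}(T - 1)\,dT$$

and 

$$Q(T) = e^{\alpha} T^{\alpha} e^{-\alpha T}.$$
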
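
The survival function of this section can be checked numerically against the breakage rate from which it was derived. The sketch below does so for one set of parameter values; the values of $k_0$, $k_1$, and $\beta$ are hypothetical, chosen only for illustration, and the function names are not from the original.

```python
import math

# Hypothetical parameter values, chosen only for illustration.
K0, K1, BETA = 0.12, 0.08, 0.5

def hazard(t):
    """Conditional breakage rate k0 - k1*exp(-beta*t)."""
    return K0 - K1 * math.exp(-BETA * t)

def survival(t):
    """q(t) = exp(k1/beta) * exp(-k0*t) * exp(-(k1/beta)*exp(-beta*t))."""
    return math.exp(K1 / BETA - K0 * t - (K1 / BETA) * math.exp(-BETA * t))

def survival_in_T(T):
    """Q(T) = exp(a1) * T**a0 * exp(-a1*T), with a0 = k0/beta and a1 = k1/beta."""
    a0, a1 = K0 / BETA, K1 / BETA
    return math.exp(a1) * T ** a0 * math.exp(-a1 * T)

t, h = 3.0, 1e-6
slope = (survival(t + h) - survival(t - h)) / (2 * h)   # numerical dq/dt
print(survival(0.0), -slope / survival(t), hazard(t))
```

The ratio $-q'(t)/q(t)$ reproduces the assumed breakage rate, and the two forms $q(t)$ and $Q(T)$ agree when $T = e^{-\beta t}$.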
3. Discussion. The mortality distribution, written in terms of the 
original technological quantities $t$, $\beta$, $\mu_0$, $\mu_1$, and $\mu_2$, is 

$$df = -\frac{\mu_0}{\beta}\, e^{(\mu_2 - \mu_1)\mu_0/\beta}\; T^{(\mu_0\mu_2/\beta) - 1}\; e^{-(\mu_2 - \mu_1)\mu_0 T/\beta}\,[(\mu_2 - \mu_1)T - \mu_2]\,dT$$

where 

$$T = e^{-\beta t} \quad \text{and} \quad dT = -\beta e^{-\beta t}\,dt.$$

It is to be noted also that the quantities $\alpha_1$ and $\alpha_0$ are related to the 
technological quantities by the relations: 

$$\alpha_1 = \frac{(\mu_2 - \mu_1)\mu_0}{\beta} \quad \text{and} \quad \alpha_0 = \frac{\mu_0\mu_2}{\beta}.$$





Appropriate tumbler tests, such as the cafeteria service test discussed 
in Part II, preceding, provide statistical data which may be 
analyzed to yield estimates of the technological quantities $\mu_1\mu_0$, $\mu_2\mu_0$, 
and $\beta$. For statistical purposes, it is more convenient to first estimate 
the distribution parameters $\alpha_0$, $\alpha_1$, and $\beta$ which then in turn determine 
the estimated values of the technological quantities. Computational 
procedures which may be used for the estimation of such parameters 
are described in detail in Part II. 

Actually, the analysis of the cafeteria service test data was carried 
out on the assumption that the mortality distribution was an incomplete 
gamma distribution. The similarity between the two distributions 
is shown when $df$ is written in the form 

$$df = \beta e^{\alpha_1}\; t^{\{\alpha_0[-\beta t/\ln t]\}}\; e^{-\alpha_1 t\{e^{-\beta t}/t\}}\; \{\alpha_0 - \alpha_1 e^{-\beta t}\}\,dt.$$

If the quantities which are shown in braces were all nearly constant over 
the range of values of $t$ involved in the test then the technological distribution 
would be essentially the same as the incomplete gamma distribution 
which was used in Part II. 

An extensive discussion of the application of the Makeham-Gompertz 
survival function to problems of industrial replacement is given 
by Kurtz,³ who shows empirically that a number of types of industrial 
property can be fitted in this manner or with the Pearson Type I distribution. 

4. The incomplete gamma distribution. An interesting technological 
model for tumbler breakage is obtained⁴ if it is assumed that a tumbler 
breaks when it is bumped the $n$th time, and that the number of bumps 
is proportional to the time in service. In this case, it is appropriate to 
use the Poisson distribution for the number of bumps the tumbler receives, 
to determine the survival function $q(t)$ as follows 

$$q(t) = 1 - e^{-\lambda t}\sum_{i=n}^{\infty} \frac{(\lambda t)^i}{i!}.$$

In this expression for $q(t)$ the quantity 

$$e^{-\lambda t}(\lambda t)^i/i!$$

represents the probability that the tumbler will be bumped exactly $i$ 
times in time $t$. The mortality distribution $df$ is obtained directly from 
the survival function $q(t)$ by differentiation, thus 

$$df = -dq = \frac{\lambda^n}{(n-1)!}\, t^{n-1} e^{-\lambda t}\,dt.$$

This distribution is exactly the incomplete gamma distribution, provided 
the critical number $n$ of bumps required to break a tumbler is 
limited to non-negative integral values. 

3 Edwin B. Kurtz, Life Expectancy of Physical Property, New York (1930). 
4 We are indebted to C. P. Winsor for calling this technological model to our attention. 

It is interesting to note from Part II, preceding, that the estimated 
values for the parameter n, there not limited to integral values, were 
1.55 and 1.60 for the two types of tumblers tested. This provides some 
evidence that the simple bump model is not adequate, even though 
the incomplete gamma function did fit the experimental data quite 
well. 
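
The connection between the bump model and the gamma form can be verified numerically for an integral number of bumps. The sketch below compares the derivative of the Poisson survival function with the gamma density; the rate of bumping, the critical number of bumps, and the time chosen are illustrative values, not estimates from the service test.

```python
import math

def survival(t, rate, n):
    """Probability of fewer than n bumps in time t (Poisson with mean rate*t)."""
    mean = rate * t
    return sum(math.exp(-mean) * mean ** i / math.factorial(i) for i in range(n))

def density(t, rate, n):
    """Gamma density rate**n * t**(n-1) * exp(-rate*t) / (n-1)!."""
    return rate ** n * t ** (n - 1) * math.exp(-rate * t) / math.factorial(n - 1)

rate, n, t, h = 0.2, 3, 7.0, 1e-5          # illustrative values, not estimates
slope = (survival(t + h, rate, n) - survival(t - h, rate, n)) / (2 * h)
print(-slope, density(t, rate, n))
```

With fractional exponents such as 1.55 and 1.60 the Poisson sum has no direct meaning, which is the difficulty with the simple bump model noted above.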
THE RELATIVE POWER OF THREE STATISTICS FOR 
SMALL SAMPLE DESTRUCTIVE TESTS 


PAUL H. JACOBSON 
Statistical Bureau, Metropolitan Life Insurance Company 

The probability distributions are given for the mean of the 
three smallest ordered observations ($m$) and for the median 
of all observations ($m_d$) in samples of size five drawn from a 
normal distribution with unit variance and zero mean. 

The Neyman-Pearson theory is used to demonstrate the 
extent by which $\bar{X}$ affords a more powerful test method than 
$m$ or $m_d$ for testing the hypothesis that $\mu = 0$ under the alternative 
hypothesis that $\mu > 0$. Also, $m$ is shown to be superior to 
$m_d$, and both are evaluated in terms of the power of $\bar{X}$ in 
samples of size three and four. 

These results are applied to destructive tests, and a special 
kind of problem is cited for which there are economic grounds 
for the use of $m$. 


THE arithmetic average of all observations in a sample ($\bar{X}$) is the 
best estimate of the unknown mean value of a normally distributed 
variate. In practice, however, there may be economic grounds or operating 
advantages for the use of some other statistic for this purpose. 
In samples of size five, for example, the mean of the three smallest ordered 
observations ($m$) and the median of all five observations ($m_d$) 
may be used as alternative statistics in special circumstances. 

It is the purpose of this paper (i) to use the Neyman-Pearson power 
theory to determine the relative efficiency of the three statistics, $\bar{X}$, 
$m$, and $m_d$; and (ii) to apply these results to the choice of $m$ or $m_d$ 
in a special type of small sample destructive test. 


STATISTICAL STATEMENT OF THE PROBLEM 


The quality ($X$) of each unit of a product is a chance variate, normally 
distributed with unit variance around an unknown mean ($\mu$). 
Good lots of the product are defined as those for which $\mu \le 0$, bad lots 
are those for which $\mu > 0$; where $\mu$ is the actual, but unknown, average 
quality of all units from which the lots are drawn. 

The consumer tests five units drawn at random from a lot in order to 
obtain information concerning $\mu$. He desires to fix the probability of 
rejecting lots for which $\mu = 0$ at $\alpha$, but he is confronted with the question 
of choosing one of many possible statistics for use in accepting or rejecting 
lots. Three such statistics, $\bar{X}$, $m$, and $m_d$, will be discussed here. 
Sampling Errors. A decision concerning a lot based on a sample 
from the lot is subject to two types of error. A good lot may sometimes 
yield a bad sample, in which case an acceptable lot is rejected. The 
probability of making this type of error (I) is commonly called the 
producer's risk. Similarly a bad lot may sometimes yield a good sample, 
in which case a lot below specified quality is accepted. The probability 
of making the latter type of error (II) is referred to as the consumer's 
risk. These two types of error may be summarized as follows: 

                                  Decision made is that
True situation               μ ≤ 0                  μ > 0
                             (good quality)         (poor quality)
μ ≤ 0 (good quality)         Correct                Type I error
                                                    (producer's risk)
μ > 0 (poor quality)         Type II error          Correct
                             (consumer's risk)


Naturally these errors should be kept to a minimum, but it must 
be realized that whenever decisions are made on the basis of samples 
there will always be some risk that these errors will be committed. 
If the sample is of predetermined size (five units), according to the 
theory first advanced by Jerzy Neyman and Egon S. Pearson, we should 
use that procedure which minimizes the probability of type II errors 
while the probability of type I errors is kept constant. 
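
For the sample mean the two risks are easily computed, and a short sketch may make the Neyman-Pearson prescription concrete. The level $\alpha = 0.05$ and the alternative $\mu = 1$ used below are illustrative assumptions only; the function name is likewise not from the original.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()

def power_of_mean(mu, n=5, alpha=0.05):
    """Probability of rejecting the lot when the true mean is mu (unit variance)."""
    critical = STD_NORMAL.inv_cdf(1 - alpha) / n ** 0.5   # reject when X-bar exceeds this
    return 1 - STD_NORMAL.cdf((critical - mu) * n ** 0.5)

type_one = power_of_mean(0.0)          # producer's risk at mu = 0
type_two = 1 - power_of_mean(1.0)      # consumer's risk at mu = 1
print(round(type_one, 3), round(type_two, 3))
```

Holding the first risk at $\alpha$ and comparing the second across statistics is exactly the comparison carried out for $m$ and $m_d$ in what follows.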

Thus, having fixed the probability of committing type I errors if 
$\mu = 0$, it is necessary to obtain for $\bar{X}$, $m$, and $m_d$ their respective probabilities 
of avoiding type II errors. For this purpose, however, the 
probability distributions of these three statistics are necessary. Since 
the distribution of $m$ is not available in the literature, it had to be 
calculated. And, though the distribution of $m_d$ has been published, the 
author was unable to find its tabulation. 

Formula for f(m). Let $X_1$, $X_2$, and $X_3$ be the three smallest ordered 
observations in a sample of five drawn from a normal distribution, 
so that $X_1 \le X_2 \le X_3$. And let 

$$\phi(X) = \frac{1}{\sqrt{2\pi}} \exp\left(-\tfrac{1}{2}X^2\right), \qquad \Phi(X) = \int_{-\infty}^{X} \phi(u)\,du.$$

And, if we put 

$$m = (X_1 + X_2 + X_3 - 3\mu)/3\sigma$$

where $\mu$ and $\sigma$ are the arithmetic mean and standard deviation of the 
population distribution, then the probability density function¹ of $m$ is given by 


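$$f(m) = 90\sqrt{2}\;\phi(\sqrt{3}\,m) \int_0^{\infty} \left[\Phi\!\left(\tfrac{3}{\sqrt{2}}\,t\right) - \tfrac{1}{2}\right] \phi\!\left(\sqrt{3/2}\;t\right) \Phi^2(-m-t)\,dt.$$

Expected value of m. Let $M$ be the expected value of $m$. Then, according 
to Landau 

$$M = -\frac{5}{\sqrt{\pi^3}} \arctan\frac{1}{\sqrt{2}} = -0.55266, \text{ to five places.}$$

Therefore, for any mean ($\mu$) and standard deviation ($\sigma$) 

$$M = \mu - 0.55266\,\sigma.$$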
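
The expected value can be checked in two ways: by evaluating the closed form, and by direct sampling. The sketch below does both for the standardized case (zero mean, unit variance). The closed form is written as $-5\pi^{-3/2}\arctan(1/\sqrt{2})$, which is the reading of the printed formula that reproduces 0.55266; the seed and number of trials are arbitrary choices.

```python
import math
import random

# Closed form for the expected mean of the three smallest of five observations.
closed_form = -5 / math.pi ** 1.5 * math.atan(1 / math.sqrt(2))

random.seed(1947)                       # fixed seed so the run is repeatable
trials = 200_000
total = 0.0
for _ in range(trials):
    sample = sorted(random.gauss(0.0, 1.0) for _ in range(5))
    total += sum(sample[:3]) / 3        # mean of the three smallest observations
simulated = total / trials

print(round(closed_form, 5), round(simulated, 3))
```

The sampling estimate carries a standard error of roughly 0.001 with this many trials, so agreement beyond the second decimal place should not be expected.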
FIGURE 1 

DISTRIBUTIONS OF MEANS IN SAMPLES OF FIVE ($\bar{X}_5$), SAMPLES OF FOUR ($\bar{X}_4$), 
AND SAMPLES OF THREE ($\bar{X}_3$), MEAN OF THREE SMALLEST OBSERVATIONS IN SAMPLES 
OF FIVE ($m$), AND MEDIAN IN SAMPLES OF FIVE ($m_d$), DRAWN FROM A NORMAL 
DISTRIBUTION WITH ZERO MEAN AND UNIT VARIANCE. 

1 The author is indebted to H. G. Landau, Ballistic Research Laboratories, Aberdeen Proving 
Ground, Md., for the formulas for $f(m)$ and $M$, the derivations of which will appear in a future publication. 
Distribution of m. The probability density function of $m$ is shown in 
Fig. 1.² The cumulative distribution, correct to approximately three 
decimal places, is given in Table 1. 

The distribution of $m$ is only slightly more concentrated than the 
distribution of $\bar{X}_4$ (mean in samples of four observations), but it is less 
concentrated than the distribution of $\bar{X}_5$ (mean in samples of five 
observations). Also, its curve is only slightly asymmetrical. (It should 
be noted that the distribution of the mean of the three largest ordered observations 
in samples of five ($\bar{m}$) is equal to the $m$ distribution reversed about the 
$\mu = 0$ axis. In other words, the expected value equals 0.55+, the height 
of the maximum ordinate is the same, and the curve is slightly skewed 
to the left.) 

It is also interesting to note that the $m$ distribution if plotted on 
normal probability paper deviates only very slightly from a straight 
line which characterizes a normal distribution drawn on the same paper. 

TABLE 1

DISTRIBUTION OF MEANS OF THREE SMALLEST ORDERED 
OBSERVATIONS IN SAMPLES OF FIVE (m) DRAWN 
FROM A NORMAL DISTRIBUTION WITH ZERO 
MEAN AND UNIT VARIANCE 

    z      P(m ≥ z)         z      P(m ≥ z)
  −2.55     1.0000        −0.55     .5007
  −2.45      .9999        −0.45     .4203
  −2.35      .9998        −0.35     .3429
  −2.25      .9996        −0.25     .2714
  −2.15      .9992        −0.15     .2080
  −2.05      .9985        −0.05     .1542
  −1.95      .9972         0.05     .1104
  −1.85      .9951         0.15     .0763
  −1.75      .9915         0.25     .0508
  −1.65      .9858         0.35     .0325
  −1.55      .9770         0.45     .0201
  −1.45      .9639         0.55     .0119
  −1.35      .9455         0.65     .0068
  −1.25      .9200         0.75     .0037
  −1.15      .8861         0.85     .0019
  −1.05      .8431         0.95     .0010
  −0.95      .7903         1.05     .0005
  −0.85      .7281         1.15     .0002
  −0.75      .6575         1.25     .0001
  −0.65      .5808         1.35     .0000

Note: For the mean of the three largest ordered observations in samples of five ($\bar{m}$), 
$P(\bar{m} \ge z) = 1 - P(m \ge -z)$. 
For example, $P(\bar{m} \ge 1.15) = 1 - .8861 = .1139$. 

2 Grateful acknowledgment is made to Milton Feier for construction of the charts. 

Formula for f($m_d$). Let $f(m_d)$ be the distribution of the median values 
in samples of five observations drawn from a normal distribution. 
Then, from Wilks³ 

$$f(m_d) = 30\,\Phi^2(X)\,[1 - \Phi(X)]^2\,\phi(X)$$

where $\Phi(X)$ and $\phi(X)$ are defined as for $f(m)$. 
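
The cumulative distribution of the median need not be obtained by summing ordinates alone; it also follows from a binomial argument, since the median of five is at least $z$ exactly when at least three of the five observations are. The sketch below compares the two routes at one point. The binomial shortcut and the midpoint-rule integration are illustrative substitutes for the tangent-formula summation used by the author, and the function names are assumptions.

```python
from statistics import NormalDist

Z = NormalDist()

def median_density(x):
    """f(m_d) = 30 * Phi(x)^2 * (1 - Phi(x))^2 * phi(x) for samples of five."""
    return 30 * Z.cdf(x) ** 2 * (1 - Z.cdf(x)) ** 2 * Z.pdf(x)

def median_exceeds(z):
    """P(m_d >= z): at least three of the five observations are >= z."""
    p = 1 - Z.cdf(z)
    return 10 * p ** 3 * (1 - p) ** 2 + 5 * p ** 4 * (1 - p) + p ** 5

# Midpoint-rule integral of the density from 1.025 upward, for comparison.
step = 0.001
integral = sum(median_density(1.025 + (k + 0.5) * step) for k in range(6000)) * step

print(round(median_exceeds(1.025), 4), round(integral, 4))
```

Both routes give a value close to the tabulated .0279 at $z = 1.025$.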
Distribution of $m_d$. The probability density function of $m_d$ is shown in
Fig. 1. The cumulative distribution, obtained approximately by the
addition of ordinates (tangent formula), is given in Table 2.⁴ The error
involved for these values was estimated⁵ to be less than two in the
fourth decimal place. Other values, correct to three decimal places,
may be obtained by interpolation of these tabulated values.


TABLE 2

DISTRIBUTION OF MEDIANS ($m_d$) IN SAMPLES OF FIVE
DRAWN FROM A NORMAL DISTRIBUTION WITH
ZERO MEAN AND UNIT VARIANCE

  $z$     $P(m_d \ge z)$       $z$     $P(m_d \ge z)$
0.000       .5000         1.075       .0225
0.025       .4813         1.125       .0180
0.075       .4441         1.175       .0143
0.125       .4073         1.225       .0113
0.175       .3714         1.275       .0088
0.225       .3365         1.325       .0069
0.275       .3031         1.375       .0053
0.325       .2712         1.425       .0041
0.375       .2411         1.475       .0031
0.425       .2129         1.525       .0023
0.475       .1868         1.575       .0017
0.525       .1627         1.625       .0013
0.575       .1408         1.675       .0010
0.625       .1210         1.725       .0007
0.675       .1033         1.775       .0005
0.725       .0875         1.825       .0004
0.775       .0736         1.875       .0003
0.825       .0615         1.925       .0002
0.875       .0510         1.975       .0001
0.925       .0420         2.025       .0001
0.975       .0344         2.075       .0001
1.025       .0279         2.125       .0000

³ Wilks, S. S., Mathematical Statistics, Princeton University Press, 1943, p. 91.

⁴ The author is indebted to Howard Levene for suggesting the method of deriving the numerical
values for the distribution of $m_d$ and for other valuable assistance in the preparation of this paper.

⁵ Steffensen, J. F., Interpolation, Baltimore, 1927, p. 159.

The distribution of $m_d$ is symmetrical about $\mu = 0$. Also, as may be
seen from Fig. 1, the height of its maximum ordinate falls between
those of $\bar{X}_4$ and $\bar{X}_3$ (mean in samples of three observations). The
distribution of $m_d$, thus, is less concentrated than $m$, and both
distributions are less concentrated than $\bar{X}_5$.

Hojo⁶ has shown that the median is very nearly normal in all cases,
and that $\sigma_{m_d} = .53557$ for $N = 5$. These values expressed in terms of the
normal probability integral compare favorably with those presented
in Table 2; the maximum difference occurs at the point of inflection
where the error is less than one in the third decimal place.⁷
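The comparison with the normal approximation can be checked numerically. The sketch below is not part of the original paper: it integrates the density $30\,\Phi^2(1-\Phi)^2\phi$ in closed form (the integral of $30u^2(1-u)^2$ is $10u^3 - 15u^4 + 6u^5$), finds the standard deviation of $m_d$ by Simpson's rule, and prints the exact upper-tail probability beside the normal value. The function names are hypothetical, and Simpson's rule stands in for the tangent formula the author used.

```python
import math


def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def Phi(x):
    """Standard normal cumulative distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def median5_density(x):
    """Density of the median of five N(0, 1) observations."""
    F = Phi(x)
    return 30.0 * F * F * (1.0 - F) ** 2 * phi(x)


def median5_upper_tail(z):
    """P(m_d >= z), from the closed-form integral of the density."""
    F = Phi(z)
    return 1.0 - F ** 3 * (10.0 - 15.0 * F + 6.0 * F * F)


def median5_sigma(lo=-8.0, hi=8.0, steps=4000):
    """Standard deviation of m_d by Simpson's rule (mean is 0 by symmetry)."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        x = lo + i * h
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * x * x * median5_density(x)
    return math.sqrt(total * h / 3.0)


sigma = median5_sigma()
print(f"sigma of m_d = {sigma:.5f}")
for z in (0.0, 0.325, 0.625, 1.025):
    exact = median5_upper_tail(z)
    normal = 1.0 - Phi(z / sigma)
    print(f"z={z:5.3f}  exact={exact:.4f}  normal approx={normal:.4f}")
```

The exact column can be set beside the Table 2 entries for the same values of $z$.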


POWER OF STATISTICS 


With their distributions available, it is now possible to compute the
power curves for $m$ and $m_d$. In order to evaluate their efficiency, they
will be compared with the power curves for $\bar{X}_5$, $\bar{X}_4$, and $\bar{X}_3$, the means
in samples of size 5, 4, and 3 respectively. Since the consumer is
indifferent to obtaining quality better than he has contracted for, he is
not concerned with the possibilities that $\mu < 0$. The critical regions (the
regions of rejection if $\mu = 0$), which have been set at size $\alpha$, are for
$\alpha = .106$:⁸

$$\bar{X}_5 \ge .559 \qquad \bar{X}_4 \ge .625 \qquad \bar{X}_3 \ge .722$$

$$m \ge .063 \qquad m_d \ge .667$$
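As a rough check on these critical regions, the sketch below recomputes the three limits for the means from $\lambda\sigma/\sqrt{N}$ with $\lambda = 1.25$, and solves for the median limit by bisection on the exact tail probability of $m_d$. It is an illustration only: the function names are hypothetical, bisection replaces the author's interpolation in Table 2, and the limit for $m$ is not recomputed because it depends on Table 1.

```python
import math


def Phi(x):
    """Standard normal cumulative distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def median5_upper_tail(z):
    """P(m_d >= z) for the median of five N(0, 1) observations."""
    F = Phi(z)
    return 1.0 - F ** 3 * (10.0 - 15.0 * F + 6.0 * F * F)


LAMBDA = 1.25
ALPHA = 1.0 - Phi(LAMBDA)  # about .106


def mean_critical_value(n, mu0=0.0, sigma=1.0, lam=LAMBDA):
    """Lower limit of the rejection region for the mean of n observations."""
    return mu0 + lam * sigma / math.sqrt(n)


def median5_critical_value(alpha=ALPHA, lo=0.0, hi=3.0, tol=1e-10):
    """Solve P(m_d >= z) = alpha by bisection (the tail decreases in z)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if median5_upper_tail(mid) > alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


for n in (5, 4, 3):
    print(f"mean of {n}: reject if >= {mean_critical_value(n):.3f}")
print(f"median of 5: reject if >= {median5_critical_value():.3f}")
```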


⁶ Hojo, Tokishige, “Distribution of the Median, Quartiles and Interquartile Distance in Samples
from a Normal Population,” Biometrika, 23: 315-336, 1931.

⁷ After this paper had been prepared, Landau advised the author that the distribution of $m_d$ can
be derived from the normal probability integral without any approximation. In a future publication, he
will show that

$$P(m_d \le z) = \Phi^3(X)\,[10 - 15\,\Phi(X) + 6\,\Phi^2(X)]$$

and that

$$\sigma_{m_d}^2 = 1 - \frac{10\sqrt{3}}{\pi}\left[1 - \frac{3}{\pi}\arctan\sqrt{\frac{5}{3}}\right] = (0.53558)^2.$$

⁸ Critical regions of size $\alpha = .106$ are used in this paper for convenience since some of the
computations were available for regions of this size. While the actual power of these statistics varies
with the size of their critical regions, their relative power is the same for regions of any given size.

The regions for $\bar{X}_5$, $\bar{X}_4$, and $\bar{X}_3$ are derived from
$\bar{X} \ge (\mu_0 + \lambda\sigma/\sqrt{N})$, where $\mu_0 = 0$, $\sigma = 1$, $\lambda = 1.25$ (since $\alpha = .106$), and $N$ is the size of the
sample; those for $m$ and $m_d$ are interpolated from Tables 1 and 2 respectively.

The power curves, representing the probability of rejection if $\mu \ge 0$,
are shown in Fig. 2. It is apparent from that figure that for all $\mu > 0$
(bad lots) the test using $\bar{X}_5$ is more powerful than the test using any
of the other four statistics. This result is well known. It should be noted,
however, that while the test using $m$ is only very slightly more powerful
than the one using $\bar{X}_4$, it does have a decided superiority over the
test using $m_d$; the latter's power curve falls approximately midway
between those for $\bar{X}_4$ and $\bar{X}_3$. In the absence of special circumstances,
therefore, if samples consist of five observations the statistics $m$ and $m_d$
should not be used for testing the hypothesis $\mu = 0$ under the alternative
hypothesis that $\mu > 0$ because their use is less favorable than the use
of $\bar{X}_5$ for all $\mu > 0$.

FIGURE 2

POWER CURVES FOR MEANS IN SAMPLES OF FIVE ($\bar{X}_5$), SAMPLES OF FOUR ($\bar{X}_4$),
AND SAMPLES OF THREE ($\bar{X}_3$), MEAN OF THREE SMALLEST OBSERVATIONS IN
SAMPLES OF FIVE ($m$), AND MEDIAN IN SAMPLES OF FIVE ($m_d$), DRAWN FROM A NORMAL
DISTRIBUTION WITH ZERO MEAN AND UNIT VARIANCE.
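Since the chart itself is not reproduced here, the ordering of the power curves can be recovered numerically for the statistics whose distributions have a closed form. The sketch below is an illustration under stated assumptions: unit variance, $\lambda = 1.25$, the median limit of .667 quoted in the text, and hypothetical function names. The curve for $m$ is omitted because it rests on Table 1.

```python
import math


def Phi(x):
    """Standard normal cumulative distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def median5_upper_tail(z):
    """P(m_d >= z) for the median of five N(0, 1) observations."""
    F = Phi(z)
    return 1.0 - F ** 3 * (10.0 - 15.0 * F + 6.0 * F * F)


LAMBDA = 1.25


def power_mean(mu, n, lam=LAMBDA):
    """P(reject) for the rule 'mean of n >= lam / sqrt(n)' when the lot mean is mu."""
    return 1.0 - Phi(lam - mu * math.sqrt(n))


def power_median5(mu, crit=0.667):
    """P(reject) for the rule 'm_d >= crit' when the lot mean is mu."""
    return median5_upper_tail(crit - mu)


print("  mu   X5     X4     m_d    X3")
for mu in (0.0, 0.5, 1.0, 1.5, 2.0):
    row = (power_mean(mu, 5), power_mean(mu, 4), power_median5(mu), power_mean(mu, 3))
    print(f"{mu:4.1f}  " + "  ".join(f"{p:.3f}" for p in row))
```

Shifting the true mean by $\mu$ is equivalent to lowering the critical value by $\mu$, which is why a single tail function serves for every point on the median curve.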


APPLICATION TO DESTRUCTIVE TESTING 


In applying these results, two types of situations in which all units 
of the sample are tested simultaneously will be considered: 

(1) All units are destroyed simultaneously when tested, as for 
example when a fragmentation shell is exploded in the midst of 
five panels; 

(2) Destruction varies with increasing test intensity, as for example 
when five fuses or radio tubes are placed in a testing panel, 
the current (or voltage, pressure, etc.) is gradually increased, 
one by one the items “blow” and the level of “blowing” is 
automatically recorded; the items are expensive and are prac- 
tically as good as new so long as they do not “blow.”

Situation 1. This situation will be used to illustrate an example of
fallacious statistical reasoning with which the author was confronted 
during the war. 

A consumer contracted to purchase a protective panel which would
permit $\mu = 0$ complete penetrations when tested under controlled
conditions by means of the explosion of a fragmentation shell. The
number of penetrations per panel was essentially normally distributed with
variance $\sigma^2 = 1$. Lots were rejected by the consumer if $\bar{X}_5$ in a sample
of five panels was equal to or greater than $k_1/\sqrt{5}$, where $k_1$ was 1.173,
specified by the probability of rejecting good lots (.120).

When a second contract was being negotiated, the consumer
suggested that he be permitted to reject lots for which the mean of the
three largest ordered observations in the test sample ($\bar{m}$) was equal
to or greater than $k_1$, on the grounds that he required greater
protection against accepting bad lots. Does $\bar{m} \ge k_1$ afford greater protection
against committing type II errors than does $\bar{X}_5 \ge k_1/\sqrt{5}$?

By interpolation of the values given in Table 1, $P(\bar{m} \ge k_1)$ is found
to be .106. The power curve for $m$ with P(type I errors) equal to .106
is given in Fig. 2. Since this curve, which is approximately similar to
the one for $\bar{m}$, is below the one for $\bar{X}_5$ with P(type I errors) = .106,
it is obviously even less powerful than the curve for $\bar{X}_5$ with P(type
I errors) = .120 for all $\mu > 0$. In other words, though somewhat fewer
lots of acceptable quality may be rejected through the use of $\bar{m} \ge k_1$,
a much greater number of bad lots may be accepted. The consumer is
definitely getting less protection against acceptance of bad lots.

Moreover, if the consumer is pressed for additional quantities of
the product and must increase the chances of accepting bad lots, it
would not be efficient to change from $\bar{X}_5 \ge k_1/\sqrt{5}$ to $\bar{m} \ge k_1$, but rather
to $\bar{X}_5 \ge k_2/\sqrt{5}$ $(= \mu_0 + 1.25\sigma/\sqrt{5})$. This follows from Fig. 2 where it is
shown that with P(type I errors) = .106, $\bar{X}_5$ is more powerful than $m$
(or $\bar{m}$) for all $\mu > 0$. In other words, if P(type I errors) for $\bar{X}_5 \ge k_1/\sqrt{5}$
is too large, $\bar{m} \ge k_1$ is a less efficient means of effecting a reduction than
$\bar{X}_5 \ge k_2/\sqrt{5}$, where $k_2$ is determined by P(type I errors) for $\bar{m} \ge k_1$.

How many more incorrect decisions the use of $m$ (or $\bar{m}$) rather than
$\bar{X}_5$ is likely to lead to in the long run if $\alpha = .106$ may be seen from Fig. 2.
For example, if the deviation from the specified quality is only slight,
$\mu = 0.1$, both statistics would be likely to lead to numerous incorrect
decisions; but the use of $m$ would result in four errors more per 1,000
decisions than would the use of $\bar{X}_5$. If the deviations were substantial,
$0.6 \le \mu \le 1.2$, both statistics would correctly lead to rejection of the
majority of bad lots, but decisions using $\bar{X}_5$ would be correct on five
to six per cent more of these lots than would those using $m$. Finally,
for lots of extremely poor quality, $\mu = 1.8$, the test using $\bar{X}_5$ would
incorrectly accept only three of every 1,000 of these bad lots but the
test using $m$ would accept as many as nine of them.

Therefore, if all five units in a sample are destroyed simultaneously 
when tested, the mean of all observations in the sample affords a 
more powerful test of the unknown mean value of the lot than does the 
mean of the three smallest (or three largest) ordered observations in 
the sample.⁹
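The acceptance rates quoted for lots of very poor quality can be approximated by simulation. The sketch below is not the author's method: it replaces the tabulated distribution of $m$ with a Monte Carlo sample drawn with Python's `random` module, and the seed, sample count, and function names are arbitrary choices made for illustration. The limit .063 for $m$ and $\lambda = 1.25$ for $\bar{X}_5$ are taken from the text.

```python
import math
import random


def Phi(x):
    """Standard normal cumulative distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def simulate_m(n_samples=200_000, seed=1947):
    """Mean of the three smallest of five N(0, 1) observations, simulated."""
    rng = random.Random(seed)
    values = []
    for _ in range(n_samples):
        sample = sorted(rng.gauss(0.0, 1.0) for _ in range(5))
        values.append(sum(sample[:3]) / 3.0)
    return values


def acceptance_rate_m(mu, m_values, crit=0.063):
    """Share of lots with mean mu accepted by the rule 'reject if m >= crit'."""
    accepted = sum(1 for m in m_values if m + mu < crit)
    return accepted / len(m_values)


def acceptance_rate_mean5(mu, lam=1.25):
    """Share of lots with mean mu accepted by 'reject if mean of 5 >= lam / sqrt(5)'."""
    return Phi(lam - mu * math.sqrt(5.0))


m_values = simulate_m()
print(f"size of the m test:   {1.0 - acceptance_rate_m(0.0, m_values):.3f}")
print(f"accepted at mu = 1.8: m {acceptance_rate_m(1.8, m_values):.4f}, "
      f"mean of five {acceptance_rate_mean5(1.8):.4f}")
```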

Situation 2. Under special conditions, such as the destructive testing
of expensive products, it may not be economical to make decisions
concerning a lot on the basis of all information which could be derived
from the test sample. Let us take the example previously cited of five
fuses or radio tubes which are being tested to determine their level of
“blowing.” In this situation, all five units are tested simultaneously
but they are not destroyed simultaneously. Since the units of the
product are practically as good as new so long as they do not “blow,”
the test may be stopped as soon as three units have been destroyed.
Would $m$ or $m_d$ be the preferred statistic in this test situation? How
many observations is this preferred statistic really worth?

As may be seen from Fig. 2, in this situation $m_d$ should not be used
for testing the hypothesis $\mu = 0$ under the alternative hypothesis that
$\mu > 0$ because its use is less favorable than the use of $m$ for all $\mu > 0$.
It should also be noted that for all $\mu > 0$, $m$ is at least as powerful as
$\bar{X}_4$. In other words, merely by placing two extra fuses in the panel, the
tester obtains information equivalent to four units at a cost of only three
units.

⁹ This statement is based on the assumption that the characteristic of the product which is being
tested is normally distributed. If there was reason to suspect that the underlying distribution was
skewed to the right, neither $m$ nor $\bar{X}_5$ would be a valid test since both are based on the assumption of
normality.

In order to further clarify the application of $m$, let us take a
hypothetical problem. A consumer has contracted to purchase a fuse which
will “blow” for a load of 10 amperes. The level of “blowing” is known
to be normally distributed with $\sigma = 1$ ampere. The fuse is expensive,
and the consumer is indifferent to accepting lots for which the level
of “blowing” is less than 10 amperes, but he must assure himself
against accepting lots for which the level is materially greater than
10. If he sets the probability of rejecting good lots at .106 and if he
can tolerate probabilities of accepting bad lots approximately equal
to those obtained by $\bar{X}_4 \ge 10.625$ amperes (see Fig. 2), he can stop the
test with the “blowing” of the third fuse and base his decisions on
$m \ge 10.063$. Thus with a testing cost of only three fuses he can obtain
at least as much information as if he had destroyed four.
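The decision rule in this example is short enough to write out. The sketch below assumes the tester records the levels at which the first three of five fuses blow; the function name and its arguments are hypothetical, and the limit .063 is the critical value for $m$ quoted earlier for $\alpha = .106$.

```python
def three_smallest_mean_test(blow_levels, spec=10.0, sigma=1.0, crit=0.063):
    """Decide on a lot from the first three of five fuses to blow.

    blow_levels: the three recorded blowing levels, in amperes.
    Returns (m, decision), where the lot is rejected if m >= spec + crit * sigma.
    """
    if len(blow_levels) != 3:
        raise ValueError("exactly three blowing levels are expected")
    m = sum(blow_levels) / 3.0
    limit = spec + crit * sigma
    return m, ("reject" if m >= limit else "accept")


print(three_smallest_mean_test([9.8, 10.0, 10.1]))
print(three_smallest_mean_test([10.0, 10.1, 10.2]))
```

The two remaining fuses are never blown, which is the saving the text describes.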

It should be noted that even if $m_d$ is used in this situation, the tester
gains the equivalent of one-half of an observation over the use of $\bar{X}_3$.
If the testing apparatus could not record the level of “blowing” for
each unit but could record the level at which the test was stopped
after the third unit had been destroyed, then $m_d$ would have to be used.
In the absence of such additional conditions, however, $m_d$ should not
be used in place of $m$.


SIMPLIFIED METHODS OF FITTING CERTAIN TYPES
OF GROWTH CURVES


Dudley J. Cowden
University of North Carolina 


A method of selected points and a graphic method are presented
for fitting the modified exponential, the Gompertz, and
the logistic. The procedure is illustrated for the logistic.


Three well-known types of curves may be fitted by methods
presented in this paper.


Modified exponential.................. $Y_c = k + ab^X$;
Gompertz.............................. $Y_c = kA^{b^X}$;
Logistic.............................. $Y_c = \dfrac{k}{1 + 10^{a+bX}}$.


In each of these curves, k is an asymptotic value, ordinarily the upper 
asymptote. The values of constants designated by the same letter are, 
of course, different for the different types of curves. If we let a=log A, 
b=log B, and C=A/k, the similarities and differences among these 
three types of curves are apparent. 


Modified exponential............. $Y_c - k = ab^X$;

Gompertz......................... $\log Y_c - \log k = ab^X$;

Logistic......................... $\dfrac{1}{Y_c} - \dfrac{1}{k} = CB^X$.


For the logistic curve, the second form of equation is derived from the 
first as follows: 


$$Y_c = \frac{k}{1 + 10^{a+bX}}; \qquad \frac{1}{Y_c} = \frac{1}{k} + \frac{10^{a+bX}}{k};$$

$$\frac{1}{Y_c} - \frac{1}{k} = \frac{AB^X}{k} = CB^X.$$





To fit a modified exponential by the method of selected points it is
desirable first to plot the data on arithmetic and/or semi-logarithmic
paper, and by inspection draw a tentative trend line resembling a
modified exponential. Next, select three values, $y_0$, $y_1$, $y_2$, on the trend
line spaced apart at $n$ intervals of time, so that $N - 1 = 2n$, where $N$ is
the number of observations. The constants for the trend equation may
then be obtained as follows:

$$b^n = \frac{y_2 - y_1}{y_1 - y_0}; \qquad a = \frac{y_1 - y_0}{b^n - 1}; \qquad k = y_0 - a.$$

For the Gompertz, logarithms of the y values are used, while for the
logistic, reciprocals are used.
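The method of selected points can be sketched in a few lines. The code below reads the formulas as $b^n = (y_2 - y_1)/(y_1 - y_0)$, $a = (y_1 - y_0)/(b^n - 1)$, and $k = y_0 - a$, which is how the worked logistic example later in the paper proceeds; the function names are hypothetical, and base-10 logarithms are an assumption for the Gompertz case.

```python
import math


def fit_selected_points(y0, y1, y2, n):
    """Constants (k, a, b) of Y = k + a * b**X from three trend values.

    y0, y1, y2 are read from the tentative trend line at X = 0, n, 2n.
    """
    if y1 == y0:
        raise ValueError("y0 and y1 must differ")
    b_n = (y2 - y1) / (y1 - y0)
    if b_n <= 0.0 or b_n == 1.0:
        raise ValueError("the three values do not follow a modified exponential")
    a = (y1 - y0) / (b_n - 1.0)
    k = y0 - a
    b = b_n ** (1.0 / n)
    return k, a, b


def fit_gompertz(y0, y1, y2, n):
    """Gompertz: the same formulas applied to logarithms; returns (k, a, b)."""
    log_k, a, b = fit_selected_points(
        math.log10(y0), math.log10(y1), math.log10(y2), n)
    return 10.0 ** log_k, a, b


def fit_logistic(y0, y1, y2, n):
    """Logistic: the same formulas applied to reciprocals; returns (k, C, B)."""
    inv_k, c, big_b = fit_selected_points(1.0 / y0, 1.0 / y1, 1.0 / y2, n)
    return 1.0 / inv_k, c, big_b


# Points generated from Y = 100 - 80 * 0.8**X at X = 0, 5, 10.
points = [100.0 - 80.0 * 0.8 ** x for x in (0, 5, 10)]
print(fit_selected_points(*points, n=5))
```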
It may be possible to obtain a better fit for the modified exponential 
by the following graphic method. 
1. Obtain a trial value of k by the expression

$$k = \frac{y_0 y_2 - y_1^2}{y_0 + y_2 - 2y_1}.$$





2. Compute values of Y—k and plot on semi-logarithmic paper. If 
the tentative trend represents a decreasing amount of annual increase 
or an increasing amount of annual decrease, the values of Y —k will be 
negative, and the sign of the scale values on the semi-logarithmic paper 
should be changed from positive to negative. 

3. If the values of Y—k are positive, and the apparent trend of 
the Y —k values is concave upward on the semi-logarithmic paper try a 
larger value of k; if it is concave downward, try a smaller value of k. 
If the values of Y—k are negative the direction of change in the trial 
value of k is reversed. 

4. Continue subtracting trial values of k until an apparently straight 
line trend results. 

5. Fit an exponential curve to the Y—k values by any appropriate 
method. Sufficiently accurate results will usually be obtained if a 
straight line is fitted by inspection to the plotted values. If this method 
is used, read the first and last values from the fitted line. The first 
value is $a$. The ratio of the last value to the first value is $b^{N-1}$.

6. Add k to each of the values of ab* obtained in step 5. These are 
the modified exponential trend values. 
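Steps 1 through 6 can be imitated numerically. The sketch below is an approximation of the graphic procedure, not a transcription of it: a least-squares line through $\log|Y - k|$ stands in for the straight line the author fits by inspection, and the function names are hypothetical.

```python
import math


def trial_k(y0, y1, y2):
    """Step 1: trial asymptote from three equally spaced trend values."""
    return (y0 * y2 - y1 * y1) / (y0 + y2 - 2.0 * y1)


def fit_exponential_to_deviations(ys, k):
    """Step 5: fit a * b**X to the values Y - k, with X = 0, 1, 2, ...

    Least squares on log10|Y - k| replaces fitting by inspection.
    All deviations must have the same sign; otherwise math.log10 raises.
    """
    n = len(ys)
    sign = 1.0 if ys[0] - k > 0 else -1.0
    logs = [math.log10(sign * (y - k)) for y in ys]
    x_bar = (n - 1) / 2.0
    l_bar = sum(logs) / n
    sxx = sum((x - x_bar) ** 2 for x in range(n))
    slope = sum((x - x_bar) * (l - l_bar) for x, l in enumerate(logs)) / sxx
    intercept = l_bar - slope * x_bar
    return sign * 10.0 ** intercept, 10.0 ** slope


def trend_values(k, a, b, n):
    """Step 6: add k back to the fitted exponential."""
    return [k + a * b ** x for x in range(n)]


ys = [100.0 - 80.0 * 0.8 ** x for x in range(11)]
k = trial_k(ys[0], ys[5], ys[10])
a, b = fit_exponential_to_deviations(ys, k)
print(round(k, 4), round(a, 4), round(b, 4))
```

With real data the trial value of k would be adjusted up or down, as in steps 3 and 4, until the plotted deviations look straight; that loop is left to the reader's eye here as it is in the text.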

The modifications of these methods necessary for fitting a Gompertz
curve are rather obvious. The fitting of a logistic by the method of
selected points will now be illustrated for United States population, by
decades, 1790-1940. The data are shown in the accompanying table.
For purposes of illustration, take as the three selected values: $y_0 = 3.9$;
$y_1 = 31.5$; $y_2 = 122.5$. The values for the constants are easily computed.

$$\frac{1}{y_0} = .2564; \qquad \frac{1}{y_1} = .03175; \qquad \frac{1}{y_2} = .008163.$$

$$B^n = B^7 = \frac{\dfrac{1}{y_2} - \dfrac{1}{y_1}}{\dfrac{1}{y_1} - \dfrac{1}{y_0}} = \frac{.008163 - .03175}{.03175 - .2564} = \frac{-.023587}{-.22465} = .10499.$$

$$b = \log B = -.13984 = \bar{1}.86016; \qquad B = .72471.$$

$$C = \frac{\dfrac{1}{y_1} - \dfrac{1}{y_0}}{B^n - 1} = \frac{-.22465}{-.89501} = .2510.$$

$$\frac{1}{k} = \frac{1}{y_0} - C = .2564 - .2510 = .0054.$$

The trend equation is therefore

$$\frac{1}{Y_c} - .0054 = (.2510)(.72471)^X.$$
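The arithmetic of this example can be retraced as follows. The sketch starts from the rounded reciprocals printed in the text and assumes the three selected values sit at 1790, 1860, and 1930 (X = 0, 7, 14), so that n = 7; that spacing is an inference from the printed value of log B rather than something the text states outright.

```python
import math

# Reciprocals of the three selected values, as printed in the text.
inv_y0, inv_y1, inv_y2 = 0.2564, 0.03175, 0.008163
n = 7  # assumed spacing in decades between the selected points

B_n = (inv_y2 - inv_y1) / (inv_y1 - inv_y0)   # B**n
b = math.log10(B_n) / n                        # b = log B
B = 10.0 ** b
C = (inv_y1 - inv_y0) / (B_n - 1.0)
inv_k = inv_y0 - C                             # 1/k

print(f"B^n = {B_n:.5f}")
print(f"b   = {b:.5f}")
print(f"B   = {B:.5f}")
print(f"C   = {C:.4f}")
print(f"1/k = {inv_k:.4f}")
```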

Fitting a logistic curve by a graphic method¹ will now be illustrated,
using the same data. The reciprocals of the Y values are shown in
column 4 of the accompanying table, and are plotted on the accompanying
chart, which has a logarithmic vertical scale. Since the apparent trend
is concave upward it is apparent that the value of 1/k is positive.
Using the same values of $y_0$, $y_1$, and $y_2$ as before we compute a
tentative estimate of 1/k.

$$\frac{1}{k} = \frac{\dfrac{1}{y_0}\cdot\dfrac{1}{y_2} - \left(\dfrac{1}{y_1}\right)^2}{\dfrac{1}{y_0} + \dfrac{1}{y_2} - \dfrac{2}{y_1}} = \frac{(.2564)(.008163) - (.03175)^2}{.2564 + .008163 - 2(.03175)} = .0054.$$


¹ A different graphic method is described by Raymond Pearl in his Introduction to Medical
Biometry and Statistics, Third Edition, Chapter XVIII, W. B. Saunders Company, Philadelphia, 1940. The
concept used by Pearl is that $\log[(k - Y)/Y] = a + bX$, while the concept used in this paper is that
$\log[(1/Y_c) - (1/k)] = \log C + bX$. The method described in this paper is less laborious, provided the lower
asymptote is zero; if the lower asymptote must be estimated by graphic methods, Pearl's method will
be found to be less laborious. The methods described in the present paper are also more general in that
they are applicable to each of the three types of curves indicated.

If this value of 1/k be subtracted from the different 1/Y values, and
these differences plotted on semi-logarithmic paper, the apparent trend
will be found to be slightly concave downward. Therefore a slightly
smaller value of 1/k is required. Column 5 of the table shows values of
$1/Y - 1/k$ where $1/k = .0052$. These are plotted on the chart. Since the
apparent trend is linear a straight line was fitted by inspection. These
are values of $1/Y_c - 1/k = CB^X$. The first and last values from this line
were read from the chart. $CB^0 = .2490$, and $CB^{N-1} = .00225$. The value
for $C$ is therefore .2490; $B^{N-1} = B^{15} = .00225/.2490 = .009036$, and
$B = .73068$. The values recorded in column 6 are values of

$$\frac{1}{Y_c} - \frac{1}{k} = (.2490)(.73068)^X.$$





For convenience of computing trend values the trend equation may be
written

$$\log\left(\frac{1}{Y_c} - \frac{1}{k}\right) = \log C + bX, \quad \text{or}$$

$$\log\left(\frac{1}{Y_c} - .0052\right) = -.60380 - .13627X.$$

GRAPHIC APPROXIMATION OF VALUES OF $1/k$ AND $CB^X$ FOR FITTING
LOGISTIC TO U. S. POPULATION, 1790-1940.

In column 7 we have added .0052 to each of the $CB^X$ values of column
6, while the logistic trend values of column 8 are the reciprocals of the
values of column 7. If preferred, the equation can be put in the form

$$Y_c = \frac{k}{1 + 10^{a+bX}}.$$

Since $1/k = .0052$, $k = 192.3$. The estimate, by this method, of the
maximum population is, therefore, 192.3 million.

$$A = Ck = (.2490)(192.3) = 47.88.$$

$$a = \log A = \log 47.88 = 1.68006.$$

The trend equation is

$$Y_c = \frac{192.3}{1 + 10^{1.68006 - .13627X}}.$$
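The fitted equation can be evaluated directly to reproduce the logistic trend values. The sketch below uses the constants quoted in the text (k = 192.3, a = 1.68006, b = -.13627); the function name is hypothetical.

```python
def logistic_trend(x, k=192.3, a=1.68006, b=-0.13627):
    """Logistic trend value, in millions, for decade number x (1790 = 0)."""
    return k / (1.0 + 10.0 ** (a + b * x))


for x in range(16):
    print(f"{1790 + 10 * x}  X={x:2d}  Y_c={logistic_trend(x):7.2f}")
```

The printed values can be set beside column 8 of the accompanying table; small differences in the last digit reflect rounding in the hand computation.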





LOGISTIC TREND FITTED BY GRAPHIC METHOD TO UNITED STATES
POPULATION, 1790-1940

               Population                              Exponential                Logistic
               in millions                             trend values               trend values
Year     X         Y          1/Y       1/Y - 1/k         CB^X         1/Y_c          Y_c
(1)     (2)       (3)         (4)          (5)             (6)          (7)           (8)
1790     0       3.929       .2545        .2493           .2490        .2542         3.934
1800     1       5.308       .1883        .1831           .1819        .1871         5.345
1810     2       7.240       .1381        .1329           .1329        .1381         7.241
1820     3       9.638       .1038        .0986           .0971        .1023         9.775
1830     4      12.87        .0777        .0725           .0710        .0762        13.13
1840     5      17.07        .0586        .0534           .0519        .0571        17.52
1850     6      23.19        .0431        .0379           .0379        .0431        23.20
1860     7      31.44        .0318        .0266           .0277        .0329        30.40
1870     8      39.82        .0251        .0199           .0202        .0254        39.37
1880     9      50.16        .0199        .0147           .0148        .0200        50.90
1890    10      62.95        .0159        .0107           .0108        .0160        62.50
1900    11      75.99        .0132        .0080           .00789       .01309       76.39
1910    12      91.97        .0109        .0057           .0057        .01097       91.16
1920    13     105.7         .0095        .0043           .00421       .00941      106.2
1930    14     122.8         .0081        .0029           .00308       .00828      120.8
1940    15     131.7         .0076        .0024           .00225       .00745      134.2

Source of data: Bureau of the Census, U. S. Department of Commerce, Statistical Abstract of the
United States, 1946, pp. 6-7.

An advantage of this, and other similar, graphic methods of fitting, is
that visual evidence is afforded of whether, and over what span of time,
a curve of a particular type is applicable. In case a value of 1/k does
not exist that produces an exponential trend for values of $1/Y - 1/k$,
one can perhaps fit a trend of some other type, such as a logarithmic
parabola, to these values. Another possible advantage, which is
applicable only to the logistic, is that the exponential trend fitted to the
$1/Y - 1/k$ values is more sensitive to observations in recent years than
to those in early years. Note that for 1790 it is very difficult to
distinguish between $1/Y - 1/k$ and $1/Y_c - 1/k$ values, but that the
distinction becomes progressively easier as the points of time become more
recent.


LORENZ CURVE ANALYSIS OF INDUSTRIAL
DECENTRALIZATION


George C. Smith, Jr.
Regional Economist, U. S. Department of Commerce 


A study of decentralization (made by calculating the wage 
earner per square mile ratio of all cities and counties in the 
United States for four census years, ranking the units in 
order of density, and cumulating their percentages of the total 
area and wage earners on a Lorenz curve) indicates that in- 
dustry in the United States has been gradually decentralizing 
since 1899. 


Industrial decentralization has been the subject of considerable
discussion during the past thirty years, but the statistical evidence
which might throw some light on actual trends toward the concentra- 
tion or dispersion of industry has scarcely been examined. The only im- 
portant factual studies are Tracy Thompson’s “Location of Manufac- 
tures, 1899-1929,” published in 1932, Daniel Creamer’s “Is Industry
Decentralizing?” (1935) and “Changes in Distribution of Manufactur- 
ing Wage Earners, 1899-1939” (1942) by Harold Kube and Ralph 
Danhof. Of the three, only Creamer’s study seeks to present a mathe- 
matical measure of decentralization; and Creamer’s classification suf- 
fers from the fact that it is somewhat illogical and possessed of a pro- 
nounced bias toward concentration. 

The first and most pressing need in the measurement of decentrali- 
zation is some sort of index of industrial concentration. Thompson and 
Creamer used heterogeneous and somewhat haphazard systems of 
classification in which some of the classes were based on population, 
others on number of wage earners, political boundaries, the subjective 
choice of a Census committee, and in one case, on all four at once. 
There was undoubtedly some correlation between the ranking of these 
classes and actual industrial density, but it is questionable whether it 
could have been great. 

It would seem desirable to measure industrial concentration directly 
as a basis for an index. Such concentration implies a ratio of some sort; 
so many units of industrial activity per unit of some standard meas- 
ure. The unit of industrial activity may well be the wage earner; a unit 
which, while not perfect, at least avoids the problem of dollar values. 
For the denominator, area or population might be used; and since 
decentralization is usually thought of as an areal phenomenon, the 
best measure is probably wage earners per square mile.

It cannot be claimed that the use of a square mile basis is entirely
satisfactory, because the political areas for which data are published 
vary greatly in size and shape. It is desirable, wherever possible, to 
use the smallest political units for which data are available. The case 
of Los Angeles County illustrates the point; a large city, with a con- 
centration of manufacturing, located in an extremely large county 
which is for the most part rural. In such cases, it is possible to remove 
the city from the county data, and treat the county remainder as a 
separate unit. The result replaces a heterogeneous unit with two units 
which are considerably more homogeneous. This method is not feasible 
in cases where city data are not available, but it has been used in this 
study wherever possible. 

The only great difficulty in arranging the scale of density for the 
United States lies in the necessity of performing some 12,000 separate 
calculations. In the absence of a clerical staff, it has been necessary 
to reduce the number of computations by lumping together all counties 
with a density of less than 10 wage earners per square mile. This lower 
limit, while arbitrary, was chosen because any county with such a low 
ratio can unquestionably be considered non-industrial, and because 
the use of the number 10 made it a simple matter to determine, by 
inspection, whether any given county fell above or below the limit. 
If the number of wage earners in a particular county exceeded the 
area figure with the decimal point moved one place to the right, the
data was entered on a county card and the ratios calculated; if not, 
the county was left with the residual non-industrial items. 
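The inspection rule amounts to comparing the number of wage earners with ten times the area. A minimal sketch follows; the function name is hypothetical, and the strict inequality follows the wording "exceeded" in the text.

```python
def is_industrial(wage_earners, area_sq_miles, threshold=10.0):
    """True if the county gets its own card, False if it joins the residual.

    Moving the decimal point of the area one place to the right is the
    same as multiplying it by the threshold of 10.
    """
    return wage_earners > threshold * area_sq_miles


print(is_industrial(5001, 500))   # density just above 10 per square mile
print(is_industrial(4000, 500))   # density of 8 per square mile
```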

Cities were treated similarly. Because the Census disclosure rule 
makes the data from smaller cities extremely spotty, it was determined, 
once again arbitrarily, to set 5,000 wage earners as the minimum for 
separate consideration of a city. While this method incorporates some 
of the disadvantages of Creamer’s study, it is an improvement in that 
wage earners, rather than population, are the limiting factor. Naturally, 
once a city had been singled out, comparability of the units demanded 
that it be given separate treatment in each Census year used, whether 
it met the minimum requirement in all the years or not; otherwise, 
many cities which experienced rapid growth after 1899 would have 
been excluded from separate consideration. Where cities were sep- 
arately considered, the wage earner and area figures were separated 
from those of the county, leaving a “county remainder” which was 
treated in the same manner as a whole county. In some counties, two 
or more cities met the minimum requirement, and were removed, and 
in one county, where seven eligible cities made up more than seventy-five
per cent of the county area, the whole county was treated as a
city unit. 
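Separating a city from its county is a simple subtraction on both wage earners and area. The sketch below illustrates it with invented figures; the `Unit` class, the function name, and the numbers are hypothetical and are not drawn from the Census data used in the study.

```python
from dataclasses import dataclass


@dataclass
class Unit:
    name: str
    wage_earners: float
    area: float  # square miles

    @property
    def density(self):
        """Wage earners per square mile."""
        return self.wage_earners / self.area


def split_county(county, cities):
    """Return the separately treated cities plus the county remainder."""
    remainder = Unit(
        name=county.name + " (remainder)",
        wage_earners=county.wage_earners - sum(c.wage_earners for c in cities),
        area=county.area - sum(c.area for c in cities),
    )
    return list(cities) + [remainder]


# Invented example: a large, mostly rural county containing one large city.
county = Unit("Example County", wage_earners=100_000, area=4_000)
city = Unit("Example City", wage_earners=80_000, area=400)
for unit in split_county(county, [city]):
    print(f"{unit.name}: {unit.density:.1f} wage earners per square mile")
```

The whole county would show a density of 25; after the split the city shows 200 and the remainder under 6, which is the gain in homogeneity the text describes.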

Each unit was given a code number, indicating the Census region, 
the state, the county name, and whether the unit was the whole 
county, a city, or a county remainder. It was then quite simple to 
perform the calculations, and to group the units by types or to rank 
them in several ways. 

Four Census years were included in the study: 1899, 1919, 1929, and 
1939. These were the years in which the most complete Censuses of 
Manufactures were compiled. It was necessary to omit 1909, since no 
county data were published for that Census, and the county cards on 
which original data were kept have been destroyed. Before 1899, non- 
comparability makes the data valueless, and even the Census of 1899 
is questionable. It was included partly because otherwise it would 
have been necessary to move the beginning date of the study up twenty 
years to 1919. 

While several analyses of the data were made, the Lorenz curve 
analysis gave the most interesting results. Although this curve was 
primarily intended for use with frequency distributions, the analogous 
use here was immediately apparent. While the Lorenz curve cus- 
tomarily appears on a square chart, with the diagonal “indifference 
line” running from the origin to the point (100%, 100%), certain modi- 
fications were made for this study. Edgar M. Hoover used the curve in 
an article in the Review of Economic Statistics to express the ratio of 
workers to population for certain specific industries.¹ Taking the ratio
of per cent of total wage earners to per cent of total population for 
selected cities and state remainders, Hoover arrived at a series of 
slopes, measured by the tangent of the angle of elevation. Arranging 
these units in order of slope, and cumulating the percentages, Hoover 
obtained a Lorenz curve. 

In the present study, there was no need to use this complex method. 
It is possible to make up the curve without the use of percentages, by 
ranking the units in order of wage earner density, plotting the cumula- 
tive totals directly on a chart drawn with the scales of cumulative 
data on each side of the square. 
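The construction described here can be sketched as follows. Two points are assumptions rather than statements from the text: units are ranked from lowest to highest density, which puts the curve below the diagonal, and the area between the diagonal and the curve is added as a one-number summary that the study itself does not compute. Function names and the sample figures are hypothetical.

```python
def lorenz_points(units):
    """Cumulative (area share, wage-earner share) points.

    units: iterable of (wage_earners, area) pairs, one per areal unit.
    Units are ranked by density, lowest first (an assumption).
    """
    ordered = sorted(units, key=lambda u: u[0] / u[1])
    total_w = sum(w for w, _ in ordered)
    total_a = sum(a for _, a in ordered)
    points = [(0.0, 0.0)]
    cum_w = cum_a = 0.0
    for w, a in ordered:
        cum_w += w
        cum_a += a
        points.append((cum_a / total_a, cum_w / total_w))
    return points


def concentration_area(points):
    """Area between the diagonal and the curve (trapezoid rule).

    Zero means every unit has the same density; this summary is an
    addition for illustration and is not used in the study.
    """
    under = sum((x1 - x0) * (y0 + y1) / 2.0
                for (x0, y0), (x1, y1) in zip(points, points[1:]))
    return 0.5 - under


even = [(10, 1), (20, 2), (30, 3)]   # same density everywhere
uneven = [(1, 9), (99, 1)]           # nearly all wage earners in one small unit
print(concentration_area(lorenz_points(even)))
print(concentration_area(lorenz_points(uneven)))
```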

¹ Vol. 18, pp. 162-71 (1936). Hoover used the “operatives” figure from the Census of Occupations
for his numerator, as being more comparable with the denominator in this case.

By plotting wage earners along the y-axis and area along the x-axis,
a Lorenz curve may be drawn which permits direct conclusions as to
concentration or decentralization. If the wage earner density of all
areal units was the same, or in other words, if all industry as
represented by wage earners was spread out evenly over the nation, what
might be called “complete decentralization” would obtain, and the
Lorenz curve would be a straight line lying along the diagonal. It is
obvious that no more complete decentralization than this could be
possible. The departure of the actual curve from the diagonal
indicates, then, the degree of concentration of the actual wage earners.
Figure 1 shows the degree of wage earner concentration in the United
States and in the four most important regions industrially, for 1939.
In order to show the differences more clearly, the chart is no longer a
square, but has been expanded along the x-axis. As would be expected,
the chart shows a large degree of centralization in all regions; and other
features of the curves also follow common beliefs. The South Atlantic
region, always reputed to be more decentralized than the others, is
definitely shown to be so. The New England region, noted for
numerous medium-sized manufacturing centers, is also less centralized
than the two remaining regions. The East North Central and Middle
Atlantic regions, characterized by tremendous concentrations, are
naturally the most centralized, and their curves show the greatest
departure from the diagonal. The United States curve lies fairly close
to those for these last two regions, since they contain more than half
the nation’s wage earners.

FIG. 1. CUMULATIVE PERCENTAGE (LORENZ) CURVES OF WAGE EARNERS
AND AREA, UNITED STATES AND FOUR REGIONS, 1939
(Area scale expanded 25 times)

The fact that the lines of Figure 1 coincide fairly well with common
knowledge is important mainly as a test of the method. Once
established as a useful measure, the curve may then be applied to a subject
which is not common knowledge, but about which there is much
speculation, and that is the trend of centralization or decentralization over
the forty-year period. Figure 2 shows the results of the analysis of the
United States figures for the four census years studied. The progression
of the lines in Figure 2 toward the diagonal is definite evidence of
decentralization for the country as a whole, in the sense that the areas
with lower wage earner densities have gained relative to the areas
with higher densities. It should be emphasized, however, that the
change has been slight, and that the x-axis of the chart has been
greatly expanded in order to make the movement more apparent to
the eye. Most important, however, is the fact that the movement,
based on a large number of individual units, is steady in the direction
of decentralization.

FIG. 2. CUMULATIVE PERCENTAGE (LORENZ) CURVES OF WAGE EARNERS
AND AREA, UNITED STATES, 1899, 1919, 1929, AND 1939
(Area scale expanded 50 times)

Although they are not reproduced here, the curves for the four re- 
gions show the same trend, except in New England, where a net con- 
centration was registered for the whole period, with the greatest con- 
centration between 1919 and 1929. The South Atlantic region shows 
the greatest decentralization, while the Middle Atlantic and East 
North Central show a steady, but less pronounced, decentralization. 
Once again the movement is small, but it is a strikingly uniform 
progression towards the diagonal of complete decentralization. 


AGRICULTURAL PRICE INDEX NUMBERS* 


ARTHUR G. PETERSON 
Office of the Secretary of Defense 


I HESITATED to participate in this program because my work has been 
quite apart from price indexes since 1944, when I transferred from 
the Department of Agriculture to the War Department. However, I 
have enjoyed refreshing my memory about agricultural index numbers 
because for several years I lived and struggled with them. 


GENERAL CONSIDERATIONS 


First I want to make a few general comments. Then I shall discuss 
the three major price indexes of the Department of Agriculture and 
make some suggestions for their improvement. These three major 
indexes relate to prices received by farmers, prices paid by farmers, 
and parity prices. 

Significant developments in science and technology and changes in 
market conditions make it necessary for us to reexamine and revise 
our price indexes from time to time. The use of some indexes in the 
administration of agricultural programs in recent years makes it all the 
more important that they be representative of the present, or at least 
of recent conditions. An increase or decrease in the parity-price index 
requires a corresponding change in the parity prices computed by the 
Department of Agriculture. Reductions in current parity prices, in 
turn, indicate a reduction in the government subsidies which are based 
on a percentage of the parity price of a particular farm product. 

In the late 1930’s, after several years’ work and an expenditure of 
over $100,000, the findings conclusively indicated some reduction in the 
recent level of the parity-price index. Although this index had become 
enmeshed in politics, I continued until early 1944 to work for its 
revision and improvement. 

Since the mid-1920’s there has been considerable discussion regard- 
ing the fixed-weight, aggregative type of index numbers adhered to by 
the Bureau of Agricultural Economics. However, it has not been prac- 
ticable, even if desirable, to obtain and use given year weights. The 
weight periods have been moved forward about once a decade, but the 
practice has been to apply the more recent weights all the way back to 
1910 or earlier. This procedure, although expedient, may no longer be 
justified. The great increase in the number of tractors since 1910, for 
instance, makes any recent weight excessive for previous decades. 
Therefore, consideration should be given to using recent weights for 
the last decade only. Separate indexes constructed for each decade with 
contemporaneous weights then could be linked to form one series. 

* A paper presented to the American Statistical Association, Atlantic City, 26 January 1947. 
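
The linking of decade indexes suggested above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the choice of a single overlapping link year, and every price and quantity are hypothetical and are not taken from the Department's series.

```python
def aggregative_index(prices_by_year, quantities, base_year):
    """Fixed-weight aggregative index: 100 * sum(p_t * q) / sum(p_base * q)."""
    base = sum(p * q for p, q in zip(prices_by_year[base_year], quantities))
    return {year: 100.0 * sum(p * q for p, q in zip(prices, quantities)) / base
            for year, prices in prices_by_year.items()}


def link_segments(segments):
    """Chain index segments; each segment starts in the last year of the one before."""
    linked = dict(segments[0])
    for segment in segments[1:]:
        years = sorted(segment)
        link_year = years[0]
        # Scale the later segment so both agree in the overlapping year.
        factor = linked[link_year] / segment[link_year]
        for year in years[1:]:
            linked[year] = segment[year] * factor
    return linked


# Hypothetical prices of two commodities and hypothetical weights for two decades.
prices = {1910: [1.0, 2.0], 1920: [2.0, 2.0], 1930: [1.0, 4.0]}
first = aggregative_index({y: prices[y] for y in (1910, 1920)}, [3, 1], 1910)
second = aggregative_index({y: prices[y] for y in (1920, 1930)}, [1, 3], 1920)
series = link_segments([first, second])
print(series)
```

Each segment uses only the weights of its own decade, so a commodity that became important late (tractors, in the example of the text) does not receive a heavy weight in earlier years.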

The adherence to a base period of a generation or more ago is un- 
fortunate in many respects. A recent base is preferable for any forward- 
looking agricultural program. In mid-1941, when a Senate Committee 
undertook to revise the parity formula, I urged the adoption of a recent 
moving average as a base for agricultural price indexes. Our British 
friends had recently adopted such a base for their index of farm prod- 
uct prices. 

The most recent 10 calendar years are recommended as a moving and 
continuing base for agricultural index numbers. A 10-year period with 
a shift forward at the end of each year would provide adequate stability 
along with self-adjustment. It would tend to provide flexibility and 
encourage shifts in production in keeping with changes in demand. Inci- 
dentally, the importance of these factors in long-range agricultural 
programs was pointed out by the President in his recent Economic 
Report to the Congress.¹ Adherence to an historical base, on the contrary, 
establishes vested interests and leads to the political creation of 
monopoly conditions in some cases. Moreover, it permits the dead hand 
of the past to interfere with economic readjustments so essential to 
progress. Even the lawyers—wedded as they are to precedents—long 
ago recognized the evils of the dead hand. That is why we have the 
general rule against perpetuities, in other than charitable trusts. 
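
The moving ten-year base recommended above can be illustrated with a short sketch. The function name and all figures are hypothetical; the only feature taken from the text is that each year is compared with the average of the ten calendar years preceding it, the base shifting forward one year at a time.

```python
def moving_base_index(values, window=10):
    """Each year's value as a percentage of the mean of the preceding `window` years."""
    index = {}
    for year in sorted(values):
        base_years = range(year - window, year)
        # Only years with a complete base period receive an index number.
        if all(b in values for b in base_years):
            base = sum(values[b] for b in base_years) / window
            index[year] = 100.0 * values[year] / base
    return index


# Hypothetical price aggregates: level at 50 for 1937-46, then a rise to 60.
values = {1937 + i: 50.0 for i in range(10)}
values[1947] = 60.0
values[1948] = 60.0
print(moving_base_index(values))
```

In the example the 1948 figure is lower than the 1947 figure although the price is unchanged, because the higher 1947 price has entered the base; this is the self-adjustment referred to in the text.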


PRICES RECEIVED 


Index numbers of prices received by farmers since 1909 were com- 
pletely revised three years ago.² The four major changes at that time 
were: (1) several commodities were added; (2) improved price series 
were used in a number of cases; (3) more useful commodity groupings 
and subindexes were developed; and (4) quantity weights were shifted 
forward a decade to the period 1935-39. 

In view of these extensive and relatively recent revisions, there is no 
immediate need for significant changes in this series by itself. Any gen- 
eral shift to a recent base period for agricultural indexes, of course, 
would call for a similar change in the index of prices received. In the 
1944 revision all aggregates for each date were totalled instead of 
applying percentage weights to the subindexes as in the index of prices 
paid by farmers and in the parity index. Totalling the aggregates ar- 
rives at the same result, but it is a laborious procedure and complicates 
the introduction of new items. A return to the simpler and more con- 
venient method should accompany the next revision. 

1 The Economic Report of the President, 8 January 1947, p. 26. 

2 Index Numbers of Prices Received by Farmers, 1910-1943, U. S. Department of Agriculture, Feb- 
ruary 1944, 36 pages, by Arthur G. Peterson et al. 
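
That the two methods arrive at the same result can be checked numerically. The sketch below is illustrative only: the commodity groups, quantities, and prices are hypothetical, and the percentage weights are taken to be each group's share of the base-period aggregate, which is the assumption under which the equality holds exactly.

```python
# Hypothetical groups: each item is (quantity weight, base price, current price).
groups = {
    "grains": [(10.0, 1.0, 1.5), (5.0, 2.0, 2.0)],
    "meat animals": [(2.0, 10.0, 15.0)],
}


def index_from_aggregates(groups):
    """Total the price-times-quantity aggregates over all items."""
    base = sum(q * p0 for items in groups.values() for q, p0, _ in items)
    now = sum(q * p1 for items in groups.values() for q, _, p1 in items)
    return 100.0 * now / base


def index_from_subindexes(groups):
    """Weight each group subindex by the group's share of the base aggregate."""
    group_base = {g: sum(q * p0 for q, p0, _ in items) for g, items in groups.items()}
    group_now = {g: sum(q * p1 for q, _, p1 in items) for g, items in groups.items()}
    total_base = sum(group_base.values())
    return sum((group_base[g] / total_base) * (100.0 * group_now[g] / group_base[g])
               for g in groups)


print(index_from_aggregates(groups), index_from_subindexes(groups))
```

With percentage weights, a new item affects only the subindex and weight of its own group, which is the convenience referred to in the text.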

The Department of Agriculture has index numbers of prices re- 
ceived by farmers, extending back on an annual basis to 1851, that 
have not been published. This historical series, although it may be 
improved from time to time, is not likely to be changed materially. It 
should be published to supersede and correct certain errors in the series 
running back to 1869, which was issued in 1940.³ 


PRICES PAID 


Before I discuss some of the desirable revisions in the index of prices 
paid by farmers for commodities, I want to review the record briefly. 
The revisions that have been attempted in the last decade—although 
they have not been adopted officially—for the most part, indicate the 
changes that seem to be needed. 

The original index numbers of prices paid by farmers were developed 
by Clarence M. Purves and published by the B.A.E. in 1928, with 
minor modifications in 1933-34. This index-number series still is the 
official series of the Department of Agriculture. Although various 
people have worked on these indexes in the last 25 years, the work has 
all been carried on under the guiding hand of Dr. O. C. Stine of the 
B.A.E. 

The first major revision of index numbers of prices paid by farmers 
was undertaken in 1936 as part of a study on income parity for agri- 
culture. The revised series, published in a preliminary report in May 
1939,⁴ indicated a reduction in current parity prices of 5½ per cent from 
the 1909-14 base and about one per cent from the 1919-29 base. Some 
agricultural interests, and one farm organization in particular, turned 
their big guns on the Department of Agriculture to head off such a 
reduction in parity. In a letter to the Assistant Secretary of Agriculture, 
it was contended that substitution of the revised series, regardless of 
its merit, was “inopportune” and “highly inadvisable.” An attack was 
also made on the use of the more recent weights based on the Con- 
sumer Purchases Study. Suffice it to say that the index numbers as 
revised in 1939 were not officially adopted.⁵ 

3 U. S. Department of Agriculture, Technical Bulletin No. 703, December 1940, by Frederick Strauss 
and L. H. Bean. 

4 U.S.D.A., Income Parity for Agriculture, Part III, Section 5, Index Numbers of Prices Paid by 
Farmers for Commodities, 1910-1938, 55 pages, May 1939, prepared by Arthur G. Peterson et al. About 
20 commodities were added, including a group of livestock items, and data for about 70 items previously 
added for the most part in the mid-1920’s were extended back to 1910. The total number of commodi- 
ties in 1910, exclusive of duplicates, was increased from 67 in the old series to 156. The commodities 
were reclassified and four new subindexes were developed. Weights for commodities used for living were 
shifted from 1924-1929 estimates to 1935-1936 data based on the Consumer Purchases Study. 

After some modifications another unsuccessful attempt was made to 
substitute an improved index. This occurred in connection with the 
1941 proposal of the Senate Committee on Agriculture and Forestry 
to revise the parity formula. At that time more recent weights 
(1935-39) were substituted for commodities used in production. More- 
over, the addition of automobiles, tractors, motor fuel and tires was 
arbitrarily shifted from 1917 to 1924 (a minority decision, by the way). 
The latter change, although it boosted the revised index about 3 per 
cent, nevertheless indicated some reduction in current parity prices. 
Because of this, and other difficulties, the 1941 revisions, although pub- 
lished, were never adopted.⁶ 

In mid-1943 the revised parity index advanced to the same level as 
the official index, owing largely to the rise in livestock prices which 
were in the new series, but not in the old. The stage was all set to sub- 
stitute the new series in August 1943, but the final computation indi- 
cated that the new series for that month was two points below the 
official series. Rather than lower the parity level more than one 
point at that time, it was decided to defer the substitution of the new 
series.⁷ The new and old parity indexes were again at the same level a 
few months later, but by that time the government policy was to make 
no change. 

The index of prices paid by farmers, as now constructed, tends to 
have an upward bias and to overstate the level of the parity concept 
for two reasons: 

1. The marked increases in performance and durability since 1910-14 
of many manufactured items such as automobiles and tires are not re- 
flected in the price series now used. Along with this observation it 
should be noted that no satisfactory method has been found for deal- 
ing with this problem. 

2. Prices of new commodities (synthetic fibers and vitamins, for 
instance) tend to be relatively high for several years. As the volume of 
output increases and per unit costs decrease, the price usually declines 
considerably and then tends to level off. The practice of waiting until 
a new item has become fairly “stable in price” and “significant in 
volume of sales,” before adding it to the index at its current level, 
tends to eliminate the effect of the typical price decline in the early 
development of such items. 

5 One of the improved sub-group indexes that was published separately in May 1939 was later sub- 
stituted “inadvertently” for the old index, viz., for Farm Machinery other than Motor Vehicles. See 
U.S.D.A., Income Parity for Agriculture, Part III, Section 4, Prices Paid by Farmers for Farm Machinery 
and Motor Vehicles, 1910-1938, 24 pages, May 1939, by Arthur G. Peterson. 

6 U.S.D.A., Materials Bearing on Parity Prices, mimeographed, July 1941. 

7 Later it developed that the August 1943 difference was only one point, the computer having 
rounded estimates of 1.5 to 2 instead of to 1. 
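
The second source of upward bias can be made concrete with a small sketch. Everything in it is hypothetical: two items with equal weights, an old item whose price never changes, and a new item whose price falls by half before levelling off. It compares an index that carries the new item from the start with one that links it in only after its price has become stable.

```python
def fixed_weight_index(prices_by_year, quantities, base_year):
    """Fixed-weight aggregative index over the items named in `quantities`."""
    base = sum(prices_by_year[base_year][k] * q for k, q in quantities.items())
    return {year: 100.0 * sum(p[k] * q for k, q in quantities.items()) / base
            for year, p in prices_by_year.items()}


years = [1, 2, 3, 4, 5, 6]
new_item_prices = [10.0, 8.0, 6.0, 5.0, 5.0, 5.0]  # early decline, then stable
prices = {y: {"old": 10.0, "new": p} for y, p in zip(years, new_item_prices)}

# (a) New item included from its first year.
early = fixed_weight_index(prices, {"old": 1.0, "new": 1.0}, base_year=1)

# (b) New item linked in at its current level in year 5, once its price is stable.
old_only = fixed_weight_index(prices, {"old": 1.0}, base_year=1)
after_link = fixed_weight_index(prices, {"old": 1.0, "new": 1.0}, base_year=5)
late = {y: old_only[y] for y in years if y <= 5}
late[6] = old_only[5] * after_link[6] / 100.0

print(early[6], late[6])
```

Under these assumptions the index with late inclusion never registers the decline that the index with early inclusion shows, so it stands higher in the final year.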

A more adequate index of prices paid by farmers is needed. Specific 
changes are recommended as follows: 

1. Base period. Use a moving average of the last ten years to provide 
a self-adjusting base, as suggested previously. The Congress as early as 
1937 established a precedent for an adjustable base period when it 
authorized the Secretary of Agriculture to establish milk prices that 
need not conform to the parity price. This departure from parity 
prices to provide administrative discretion in price programs for indi- 
vidual commodities may point the way to better price programs in the 
future. This hope is based on a presumption that such administrative 
discretion will operate for the general good and not to the special ad- 
vantage of minority pressure groups. 

2. Weight period. Shift to the 1935-39 weights (1935-36 for living 
items), at least for the period since 1930. Moreover, adequate prepara- 
tion should be made for the collection and use of weights for the 
decade of the 1940’s. 

3. Coverage. Additional price data collected since 1936 should be in- 
corporated in the indexes such as: 

a. Additional commodities. 

(1) A number of important commodity price series not in the index 
that are available from 1910 to date, including feeder cattle and other 
livestock purchased by farmers. Incidentally, the addition of livestock 
items would add flexibility to the index. 

(2) Some additional price series for the last decade that have been 
collected for food and clothing. 

(3) Some items such as baby chicks that have become increasingly 
important in farm expenditures in recent years. 

b. Extension of price series back to 1910. Price series for about 75 
commodities that were added for the most part in the 1920’s have been 
extended back to 1910. The inclusion of prices for the earlier years 
would improve not only the general index, but especially some of the 
subindexes. 

c. Service rates. The inclusion of rates paid by farmers for such 
services as electricity, medical care, telephones, and movies is a long 
recognized need. Services constitute about 15 per cent of farmers’ ex- 
penditures for living. The addition of service rates to prices of com- 
modities used for family living would provide a more comprehensive 
coverage and a better indicator of changes in rural living costs. It 
would make this B.A.E. index more comparable with the B.L.S. index 
of urban retail prices of consumers’ goods, which includes service 
rates. This would help to avoid some misleading comparisons between 
the B.A.E. and the B.L.S. series such as were made in World War II. 
The addition of service rates to the general index of prices paid would 
add stability to that index, making it a little lower in periods of high 
commodity prices and a little higher in depression periods such as 1932.⁸ 
Rates paid by farmers for various services are available for selected 
periods from 1910 to 1936.⁹ These series could be brought up to date 
and collected on an annual basis at relatively small cost. 

4. Classification. Regrouping of items, along the lines proposed in 
1939 and later years, would increase the usefulness of the subindexes. 

5. Regional index numbers. The development of regional indexes of 
prices paid by farmers should be undertaken from 1924 to date. Prior 
to that date the price sample seems inadequate for this purpose. 

Before leaving the index of prices paid let me say that its revision 
seems long overdue. The wartime agreement to support postwar prices 
of many farm products at 90 per cent of the parity price will terminate 
at the end of 1948. Therefore, revisions of the prices paid and parity 
indexes should be undertaken and published in ample time for official 
adoption by January 1949. 


PARITY INDEX 


In proposing certain revisions in the parity index I am not unmindful 
of the difficult political problems that some changes involve. The gen- 
eral nature of the parity-price formula and its use, in the final analysis, 
is a matter for the Congress to determine. Several modifications have 
been made since the parity-price concept was adopted in the Agricul- 
tural Adjustment Act of 1933. These changes have been of three types: 

1. Changing the composition of the parity index. In August 1935, 
interest and tax rates per acre were added in computing parity prices 
from the 1909-14 base, which raised such parity prices about 4 per 
cent.¹⁰ Interest and tax rates were not included in computing parity 
from base periods subsequent to 1918 because to do so would have 
lowered such parity prices. Oddly enough, the effect of interest rates 
in recent years, and of tax rates in recent months, has been to lower 
parity prices. 

8 U.S.D.A., Income Parity for Agriculture, Part III, Section 5, supra, pp. 26-28. 

9 Ibid., Section 1, Medical Service Rates to Farmers, August 1938; Section 2, Rates for Electricity for 
Farm Home and Farm Power, September 1938; Section 3, Telephone Rates to Farmers in the United 
States, December 1938. 

10 Bureau of Agricultural Economics, Mimeographed report on Index Number of Prices, Taxes and 
Interest Payable by Farmers, by Arthur G. Peterson, August 1935. 

Farm wage rates were omitted deliberately in 1935 because their 
inclusion would have lowered parity at that time.¹¹ The subsequent 
sharp relative rise in farm wage rates has resulted in much pressure to 
add them to the parity formula. The Pace Bill to accomplish this was 
reintroduced in the House of Representatives early this month 
(January 1947).¹² 

2. A second type of legislative change has been to shift or modify 
base periods for some commodities. All or part of the 1919-29 period 
has become the base for some fruits and many vegetables. The base 
period for burley and flue-cured tobacco was shifted to August 1934- 
July 1939 to increase their parity prices substantially.¹³ The provisions 
for “comparable prices” in certain circumstances, and exceptions in the 
case of milk, were equivalent to the creation of special parity bases.¹⁴ 

3. A third type of change has been to vary the percentage of parity 
for purposes of loans, minimum purchase and resale prices, and price 
guarantees. Parity for many years seemed a satisfactory objective for 
farmers. But when this goal was reached—and even before it was 
reached for some farm products—a more favorable objective was 
sought and obtained. In the Emergency Price Control Act of January 
30, 1942, the Congress “rediscovered” and restated “equality” for agri- 
culture.¹⁵ Ceiling prices for any farm product could not be set at less 
than 110 per cent of its parity price. The price control act had other 
restrictions on the establishment of ceiling prices for farm products, 
which made the average of such prices equivalent to 115 per cent of 
parity. 

President Roosevelt was opposed to the provision for 110 per cent of 
parity. In a message to the Congress, April 27, 1942, he stated in part: 


In fairness to the American people as a whole, and adhering to the purpose of 
keeping the cost of living from going up, I ask that this formula be corrected, 
and that the original and excellent objective of obtaining parity for the farmers 
of the United States be restored. 


The President reiterated the above request in his “inflation message” 
to the Congress on September 7, 1942. 

11 A later amendment specified that the parity index with a 1909-1914 base should also reflect changes 
in freight rates on farm products. Senator Borah, who sponsored this amendment, was asked by a Senate 
colleague what the effect would be. He said he did not know, but knew that freight rates were much 
higher than in 1909-1914. Apparently the Dept. of Agriculture was not consulted at that time. Later, 
I was asked by the Solicitor what should be done about this amendment. I indicated that in my opinion, 
freight rates already were reflected on both sides of the parity-price equation, both in prices received 
and paid by farmers. This interpretation was adopted and adhered to. Apparently the first official 
statement of this position was released in a letter to Mr. A. H. Garside of the New York Cotton Ex- 
change, 3 May 1938. This letter was prepared by the author for Ralph Mershon of the Agricultural 
Adjustment Administration. 

12 H.R. 135, 80th Congress, 1st Session. 

13 Bureau of Agricultural Economics, The Tobacco Situation, September 1941, Cover page chart 
(Neg. 39354). 

14 S. 103, 80th Congress, 1st Session, proposes to legislate a comparable price for wool (and for 
lambs) one-fourth above the parity price. 

15 U.S.D.A., B.A.E., “Parity” Rediscovered. An address by Howard R. Tolley, Chief, B.A.E., 
August 26, 1942, pp. 5-6. 

Although the parity formula has been modified several times, parity 
prices at any given time are determined in whole, or in large part, by 
the course of the index of prices paid by farmers for commodities. The 
latter index is the parity index for all farm products for which the base 
period is subsequent to 1918. Moreover, in the parity index with the 
1910-14 base, the index of prices paid for commodities constitutes about 
86 per cent of the total. Inasmuch as the index of prices paid already 
has been discussed, there remains only to say a few words about other 
components of, and possible additions to, the parity formula. 
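
The arithmetic connecting these indexes with a parity price can be sketched as follows. This is an illustration only: the 86 per cent weight is from the text, but the treatment of the parity index as a simple weighted average, the assignment of the remaining 14 per cent to a combined interest-and-tax index, the function names, and all index levels and prices are assumptions made for the example.

```python
def parity_index(prices_paid, interest_and_taxes, weight_paid=0.86):
    """Weighted combination of the prices-paid index and an interest-and-tax index."""
    return weight_paid * prices_paid + (1.0 - weight_paid) * interest_and_taxes


def parity_price(base_period_price, index):
    """Base-period price carried forward by the parity index (base = 100)."""
    return base_period_price * index / 100.0


# Hypothetical levels: prices paid at 200, interest and taxes at 150.
with_interest_and_taxes = parity_index(200.0, 150.0)
commodities_only = parity_index(200.0, 150.0, weight_paid=1.0)

price = parity_price(1.00, with_interest_and_taxes)  # hypothetical base price of $1.00
support_level = 0.90 * price    # support at 90 per cent of parity
ceiling_floor = 1.10 * price    # lowest permissible ceiling at 110 per cent of parity
print(with_interest_and_taxes, commodities_only, price, support_level, ceiling_floor)
```

With these made-up levels the interest-and-tax component lowers the parity index, which corresponds to the situation described earlier for recent years; when that component stands above the prices-paid index the effect is reversed.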

The addition of interest and tax rates per acre to the parity index was 
a political expedient to raise parity prices at that time. The same would 
be true if farm wage rates were added at their present relatively high 
level, and especially if—as often proposed—they were given a weight 
to represent the imputed value of farm family labor as well as the ex- 
penditure for hired farm labor.¹⁶ From an economic or statistical stand- 
point the inclusion of rent, insurance, income taxes and some other 
items would seem to be as logical as the addition of interest and tax 
rates on farm real estate. If all expenditure items were included in the 
parity index, farmers might raise their parity prices by increasing their 
spending. 

Instead of attempting to summarize this brief discussion let me con- 
clude with this comment. Abuse of the parity concept should be care- 
fully avoided so as not to deprive agriculture of many advantages 
gained since 1933. Costly subsidies and the creation of artificially high 
prices for food and fiber might arouse the public and lead to abandon- 
ment of related price-support programs. 

16 The O'Mahoney amendment to the price control bill, which would have raised parity prices 
markedly by adding industrial wage rates to the parity formula, was passed by both the House and 
Senate in January 1942. Although it was backed by eleven national and regional farm organizations, the inter- 
vention of the President resulted in the elimination by House and Senate conferees of this inflationary 
proposal. 


ON THE CHOICE OF FORECASTING FORMULAS 


PAUL G. HOEL 
University of California at Los Angeles 


A significance test that possesses certain optimum proper- 
ties is derived for testing the hypothesis that the means of n 
independently normally distributed variables with a common 
variance are given by one of two forecasting formulas rather 
than by the other. 


1. INTRODUCTION 


FORECASTERS are frequently confronted with the problem of deciding 
whether a new forecasting formula is superior to one in current use. 
For example, a meteorologist might devise a new formula for predict- 
ing the atmospheric pressure at weather stations and would like to 
know if it is actually superior to a standard formula before he advocates 
its use. Or, an economist might desire to test the conflicting claims of 
two groups of price forecasters. 

In attempting to compare two forecasting formulas, an experi- 
menter will usually apply them to the same set of data. If the new 
formula yields only a slightly better record according to some criterion, 
the experimenter will not be certain whether the new method is 
actually superior or whether the slight improvement could reasonably 
be attributed to sampling variation. Thus, he will usually insist that 
the difference be shown to be significant before he will consider adopt- 
ing the new formula. In this paper, a significance test is derived for 
treating a certain class of such problems. 


2. ASSUMPTIONS 


Let the variable to be predicted be denoted by $x$ and let $x_1, x_2, \ldots, x_n$ 
denote the observed values of $x$ under $n$ selected sets of circumstances. 
The circumstances are determined by the values of the variables that 
are being used in the forecasting formulas. For example, in predicting 
atmospheric pressure at a station, the circumstances might include the 
pressures at neighboring stations for several preceding days, as well as 
other variables. By a forecasting formula is understood an ordinary 
algebraic formula, or a set of tables, that expresses $x$ in terms of the 
chosen variables. Such a formula will ordinarily have been obtained 
from a combination of theoretical considerations and past observa- 
tions. Since the formula is completely determined before the observa- 
tions $x_1, \ldots, x_n$ are taken, it is a strict forecasting formula rather than 
a fitted formula such as is often used in regression prediction and in 
which certain parameters are determined from the observations. 

It will be assumed that if repeated sets of observations are made 
with the same set of circumstances as for the first set, the $x_i$ may be 
treated as a set of $n$ independent normally distributed variables with 
means $m_i$ and a common variance $\sigma^2$. This assumption is similar to the 
assumption made in regression analysis, except that there the $m_i$ are ex- 
pressed linearly in terms of the fixed variables for linear regression, or 
linearly in terms of simple functions of the fixed variables for curvi- 
linear regression. The variable we are interested in predicting, such as 
atmospheric pressure, must therefore not vary appreciably more for 
one set of circumstances than for another. For example, the variability 
in pressure should not increase with an increase in the mean pressure. 
The assumption of a common variance may be unrealistic in many 
situations. The fluctuation in certain prices about a mean value, for 
example, would be expected to be much greater under some sets of 
circumstances than under others. 

Let $u_i$ denote the predicted value of $x_i$ by the old formula and $v_i$ 
by the new formula. Furthermore, let 

$$H_0: m_i = u_i, \qquad H_1: m_i = v_i \qquad (i = 1, 2, \ldots, n)$$

denote the two alternative hypotheses that the true means of the $x$’s 
are given by the old and the new formula, respectively. The problem 
then is to devise a method for testing the hypothesis $H_0$ against the 
alternative $H_1$. It should be noted that $u_i$ and $v_i$ are fixed values be- 
cause they were obtained from their respective formulas and the inde- 
pendent variables in the formulas are held fixed in repeated sampling 
experiments. Geometrically, $u$ and $v$ would denote the $x$ coordinate of 
the two surfaces of regression corresponding to the two formulas, in 
which the independent variables are the variables used in the forecast- 
ing formulas. 


3. THEORY OF BEST TESTS 


For the purpose of designing a test, it is necessary to have the joint 
distribution function of the variables $x_1, x_2, \ldots, x_n$ when $H_0$ is true. 
Because of the assumptions made in the preceding section, this dis- 
tribution may be written as 

$$p_0(x_1, \ldots, x_n) = \frac{e^{-\frac{1}{2}\sum (x_i - u_i)^2/\sigma^2}}{(2\pi)^{n/2}\sigma^n}. \tag{1}$$

The corresponding distribution when $H_1$ is true will be denoted by 
$p_1(x_1, \ldots, x_n)$. It may be obtained from (1) by replacing $u_i$ by $v_i$. 

In constructing a test, it is convenient to treat the variables 
$x_1, \ldots, x_n$ as the coordinates of a point in $n$ dimensions and to use 
geometrical ideas. Since $\sigma^2$ is unknown, it would be desirable to design 
a test that is independent of $\sigma^2$. Now from the theory of Neyman and 
Pearson¹ on best tests, one should search for a critical subregion of the 
desired size, say $\alpha$, on each hypersurface 

$$\phi = \frac{\partial \log p_0}{\partial \sigma} = \text{constant}, \tag{2}$$

such that every point of it and no others will satisfy the inequality 

$$\frac{p_0(x_1, \ldots, x_n)}{p_1(x_1, \ldots, x_n)} < k(\phi), \tag{3}$$

where $k(\phi)$ is a constant depending on $\phi$ chosen to make the critical 
subregion one of size $\alpha$. The region obtained as the locus of such sub- 
regions with $\phi$ as parameter then constitutes a best critical region of 
size $\alpha$ independent of $\sigma^2$ for testing $H_0$ against $H_1$. 


4. CONSTRUCTION OF CRITICAL REGION 


The preceding theory may be applied to (1). Differentiation of the 
logarithm of (1) will show that (2) becomes 

$$\phi = -\frac{n}{\sigma} + \frac{\sum (x_i - u_i)^2}{\sigma^3};$$

consequently the system of hypersurfaces $\phi$ = constant is the system of 
spheres 

$$\sum (x_i - u_i)^2 = \text{constant}.$$

Furthermore, from (1) it follows that 

$$\frac{p_0}{p_1} = \frac{e^{-\frac{1}{2}\sum (x_i - u_i)^2/\sigma^2}}{e^{-\frac{1}{2}\sum (x_i - v_i)^2/\sigma^2}} = e^{\frac{1}{\sigma^2}\left[\sum x_i(u_i - v_i) - \frac{1}{2}\sum (u_i^2 - v_i^2)\right]}.$$

Therefore, upon taking logarithms, it will be found that (3) will be 
satisfied if 

$$\sum x_i(u_i - v_i) < k'(\phi), \tag{4}$$

where $k'(\phi)$ could be expressed in terms of $k(\phi)$ and quantities not in- 
volving the $x_i$, if desired. However, it is more convenient to consider 
the inequality 

$$\sum (x_i - u_i)(u_i - v_i) < c(\phi), \tag{5}$$

which is equivalent to (4) if $c(\phi) = k'(\phi) - \sum u_i(u_i - v_i)$. The problem 
now is to determine the value of $w$ such that the subregion (cap) on 
the sphere 

$$\sum (x_i - u_i)^2 = r^2 \tag{6}$$

that is cut off by the plane 

$$\sum (x_i - u_i)(u_i - v_i) = w \tag{7}$$

will be a subregion of size $\alpha$. In order that (5) will be satisfied, it is 
necessary to select the cap of the sphere for which points will make the 
left side of (7) less than $w$. This means that the cap cut off must be of 
such a size that the conditional probability that the sample point will 
fall on this cap, it being known that the sample point lies on the sphere 
(6), will be equal to $\alpha$. 

1 J. Neyman and E. S. Pearson, “On the problem of the most efficient tests of statistical hypoth- 
eses,” Royal Society, London, Philosophical Transactions, Vol. 231, 1933, pp. 289-337. 

It will now be shown that the desired critical region can be found by 
means of Student’s $t$ distribution. The demonstration consists in trans- 
lating and rotating axes properly until the problem reduces to one that 
has been solved by Neyman and Pearson. Since a rotation of axes cor- 
responds to an orthogonal transformation of variables, it is merely 
necessary to select the proper orthogonal transformations. 

First, let y;=2;—u;, then (1) becomes 


e-12(y/o)? 





8 ) 1 _~ es ; Un = 
(8) Pol, Yn) (Om)*l2gn 


whereas (6) and (7) reduce to 


(9) DSy2@=r? and > ya; = wv, 


where @;=Ui—2;. 
Next, consider an orthogonal transformation of the variables y; to 
new variables t; such that 


Yai + +++ TF Yndn 





nV Sat 


and the r-maining ¢; are selected to make the transformation orthogonal 
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with Jacobian plus one. For such a transformation, > >y?= > t?; 
hence (8) becomes 


en §Z (tile)? 
10 (é a i = ——_—_, 
(10) Po(t (n)*ligs 


whereas (9) reduces to 


(11) } > t;? =r? and t/> a;* = u 


Finally, consider an orthogonal transformation to new variables 2; 
such that 





i = oy ~— = 5/n. 


Then (10) becomes 


e— 82 Gile)* 
12 (41,-++, 2.) =————» 
( ) Pott (22) "/2g” 


whereas (11) becomes 
13 “22=r? and 24/n)> a; = w. 
( lus 


Now Neyman and Pearson have shown that the proper value of $w$
in (13) to yield a critical region of size $\alpha$ is a value of $w$ such that

$$\frac{\sqrt{n-1}\,w}{\sqrt{r^2\sum a_i^2 - w^2}} = -t_{2\alpha}, \tag{14}$$

where $t_{2\alpha}$ is the value of $t$ found in tables of Student's $t$
distribution for which $P[\,|t| > t_{2\alpha}\,] = 2\alpha$ for $n-1$ degrees
of freedom. Thus, for $\alpha = .05$ it is necessary to use the 10 per cent
value of $t$ since only one tail of the $t$ distribution is being used here.
With this selection of $w$, it therefore follows from the discussion after
(7) that the desired critical region will consist of that part of sample
space for which the left side of (14) is less than $-t_{2\alpha}$, because
the left side of (14) will decrease when $w$ decreases. In terms of the
original variables in (6) and (7), the critical region determined by (14)
then becomes


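$$\frac{\sqrt{n-1}\,\sum (x_i - u_i)(u_i - v_i)}{\sqrt{\sum (x_i - u_i)^2 \sum (u_i - v_i)^2 - \left[\sum (x_i - u_i)(u_i - v_i)\right]^2}} < -t_{2\alpha}. \tag{15}$$

If (15) is satisfied by the sample $x_1, \ldots, x_n$, then $H_0$ will be
rejected in favor of $H_1$ at the significance level $\alpha$.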
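The computation in (15) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, `u` and `v` are taken to be the forecasts under $H_0$ and $H_1$, and the critical value $t_{2\alpha}$ is supplied by the user from a table of Student's $t$ with $n-1$ degrees of freedom rather than computed.

```python
import math


def forecast_test_statistic(x, u, v):
    """Left side of (15) for observations x and the two sets of forecasts u, v."""
    n = len(x)
    if n < 2 or len(u) != n or len(v) != n:
        raise ValueError("x, u and v must have the same length n >= 2")
    y = [xi - ui for xi, ui in zip(x, u)]   # y_i = x_i - u_i
    a = [ui - vi for ui, vi in zip(u, v)]   # a_i = u_i - v_i
    w = sum(yi * ai for yi, ai in zip(y, a))
    r2 = sum(yi * yi for yi in y)
    a2 = sum(ai * ai for ai in a)
    denom = r2 * a2 - w * w                 # non-negative by the Cauchy-Schwarz inequality
    if denom <= 0:
        raise ValueError("statistic undefined: x - u is proportional to u - v")
    return math.sqrt(n - 1) * w / math.sqrt(denom)


def reject_h0(x, u, v, t_two_alpha):
    """True when the sample falls in the critical region (15)."""
    return forecast_test_statistic(x, u, v) < -t_two_alpha


# Hypothetical three-point sample, for illustration only.
example_stat = forecast_test_statistic([1, 1, 0], [0, 0, 0], [1, 0, 0])
```

With $n = 3$ the tabled 10 per cent value of $t$ for 2 degrees of freedom is about 2.92, so this small example would not lead to rejection of $H_0$.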
5. NATURE OF THE TEST


The preceding test should not be interpreted as a device for selecting
one of two alternative formulas. As indicated in the introduction, the
forecaster will ordinarily be reluctant to change over to a new
forecasting technique unless he has strong assurance that the new formula
is superior; consequently he will be interested in a significance test of
the type derived in the preceding section. If the problem were merely one
of making a choice between two formulas without any concern for the
chances of error, a simple criterion for making such a choice would be
the ratio of the squares of the distances of the sample point from the
two mean points, namely,

$$\frac{\sum (x_i - u_i)^2}{\sum (x_i - v_i)^2}.$$

If this ratio were less than one, $H_0$ would be preferred to $H_1$;
otherwise the converse would be true. Because of the spherical nature of
the normal distributions of the $x$'s under $H_0$ and $H_1$, this choice
would correspond to selecting $H_0$ if the probability density at the
sample point were greater under $H_0$ than under $H_1$. The distribution
function of this ratio under $H_0$, however, is not independent of
$\sigma^2$ and therefore it would not yield a significance test of the
desired type for the forecaster who insists on a significance test.

In applying a test such as (15), the forecaster may easily err on the
"status quo" side by employing a sample that is too small to give much
power to the test, with the result that $H_0$ will be accepted with a
relatively high frequency when $H_1$ is actually the more realistic
hypothesis. It is not necessary, of course, to make $H_0$ correspond to
the old formula and $H_1$ to the new formula. A forecaster might
conceivably consider the rejection of a new formula that is actually
better as a much more serious error than the rejection of an old formula
that is better than the new formula, in which case he would interchange
$H_0$ and $H_1$ in order to control the more serious error at a desired
level. In either situation, however, he should be concerned about the
size of the other error.





6. RELATED TESTS 


A test that is somewhat related to the preceding test was designed
by Hotelling² by means of correlation coefficients. His test is concerned
with the selection of the best independent variables in a linear
regression equation. If $u$ and $v$ are treated as two competing
independent variables for predicting $x$ by means of a linear function,
Hotelling's test could be applied, provided the necessary assumptions
were made; however, there are several objections to such a procedure.
Since correlation coefficients are unaffected by linear transformations
on the variables, one of the competing variables could give rise to much
larger errors of prediction than the other and yet yield a higher
correlation coefficient.³ The assumptions necessary to apply Hotelling's
test are also rather stringent here. Since his test was designed for
another purpose, it is not surprising that it is not satisfactory for the
particular type of problem treated in this paper.

² H. Hotelling, "The selection of variates for use in prediction with some comments on the
problem of nuisance parameters," Annals of Mathematical Statistics, Vol. 11, 1940, pp. 271-283.

³ For example, the addition of a bias of a fixed amount for all $x_i$ to the formula $u$ would not
affect the correlation between $x$ and $u$ but it would increase the errors of prediction,
algebraically, by this fixed amount.
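The objection to a correlation criterion can be illustrated numerically. The sketch below uses made-up observations and forecasts (all values and function names are hypothetical) and shows that adding a fixed bias to a forecast leaves its correlation with the observations unchanged while the root-mean-square error of prediction grows.

```python
import math


def pearson_r(a, b):
    """Product-moment correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    var_a = sum((ai - mean_a) ** 2 for ai in a)
    var_b = sum((bi - mean_b) ** 2 for bi in b)
    return cov / math.sqrt(var_a * var_b)


def rms_error(actual, forecast):
    """Root-mean-square error of prediction."""
    n = len(actual)
    return math.sqrt(sum((x - f) ** 2 for x, f in zip(actual, forecast)) / n)


# Hypothetical data, for illustration only.
x = [10.0, 12.0, 9.0, 15.0, 11.0]          # observed values
u = [11.0, 12.5, 8.0, 14.0, 10.5]          # forecasts from some formula
u_biased = [ui + 5.0 for ui in u]          # same formula with a fixed bias added

r_plain, r_biased = pearson_r(x, u), pearson_r(x, u_biased)
rmse_plain, rmse_biased = rms_error(x, u), rms_error(x, u_biased)
```

Here `r_plain` and `r_biased` agree to within rounding error, whereas `rmse_biased` is several times `rmse_plain`.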


USE OF VARIANCE COMPONENTS IN THE ANALY- 
SIS OF HOG PRICES IN TWO MARKETS 


R. L. ANDERSON 
Institute of Statistics, North Carolina State College 


This paper discusses various methods of analyzing the ratios
of the daily hog prices at Cincinnati and Louisville and compares
the results obtained with those previously given for the
differences between the two price series. The data consisted
of the prices for each of two weight classes on each of the five
complete market days of the week (Monday through Friday)
averaged for each month throughout the period 1937-1941.
Price ratios were used because of the high correlation which
seemed to exist between the price differences and the price
level.

The analysis of variance was used to make statements about
the consistency of the results. The paper discusses the
assumptions behind the analysis of variance and how well these
assumptions are met with these data. In general, the most
serious defect in making tests of significance is a lack of
independence due to the intracorrelations existing between daily
prices.

The F-test was used to test for the existence of consistent
fixed effects and to determine significant sources of random
variation. Most of these tests were complicated by the composite
nature of the error variances. An approximation method
of H. Fairfield Smith and F. E. Satterthwaite was used to
approximate the error degrees of freedom in such cases. Mention
is made of a suggested new model which uses $\chi^2$ as a test
criterion instead of F. For the ratios, most of the interactions
with years were significant; also, the month fixed effects.

A brief discussion is also included of the problem of estimat- 
ing the standard errors of the variance components, and of 
the uses of these variance components in formulating plans 
for future collection of data. 


INTRODUCTION 


IN THE analysis of economic data, we are faced with two alternatives.
Either we adapt to economic data the existing methodology which
was developed primarily for the analysis of biological data, knowing
that some unrealistic assumptions must be made and hoping that
aberrations from these assumptions do not seriously affect the analysis; or
we develop an entirely new statistical methodology especially adapted
to the intracorrelations existing in economic data and to the
non-randomness of the sampling methods. The latter course of action should
be pursued if it is mathematically and computationally feasible;
however, we are constantly being asked to make immediate recommendations,
which cannot wait for the development of the new body of
theory. This paper discusses one such problem.

Data have been collected by the Agricultural Economics Depart- 
ment at the University of Kentucky under the direction of Dr. D. G. 
Card on the daily hog prices on the Cincinnati and Louisville markets. 
These economists were first asked to ascertain if the market price 
differentials for 1940 were consistent from month to month, from day 
to day and from weight class to weight class. Some information was 
also desired on the consistency of the month and day differences from 
class to class. As a first step, some tentative analyses were made on 
these differences, but since then some data have been furnished for the 
period 1937-1941. This paper will discuss the analysis of the more 
extensive data which cover this longer period. The original data con- 
sisted of average daily prices in terms of cents per hundredweight for 
the five complete market days of the week, Monday through Friday, 
and for each of two weight classes, 180-200 pounds and 200-220 
pounds. 

An analysis of the price differences was discussed in a paper pre- 
sented at a meeting of the American Statistical Association in Atlantic 
City, January 1947. Since then it has been decided to study the ratios 
of the two prices; this analysis has been carried on by using the differ- 
ence between the logarithms of the prices at the two markets. These 
log ratios are presented in Table 1 with the logarithms multiplied by 
10,000. All analyses will use these coded log ratios. 
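The coding of the log ratios can be sketched as follows. The sketch assumes common (base-10) logarithms and a Cincinnati-over-Louisville ratio; the text states neither explicitly, although base 10 is consistent with the magnitudes in Tables 1 and 2. The function names and the example prices are hypothetical.

```python
import math


def coded_log_ratio(price_cincinnati, price_louisville, scale=10_000):
    """Difference of the base-10 logarithms of the two prices, multiplied by `scale`."""
    return scale * (math.log10(price_cincinnati) - math.log10(price_louisville))


def ratio_from_coded(coded, scale=10_000):
    """Recover the price ratio from a coded log ratio."""
    return 10 ** (coded / scale)


# Hypothetical prices in cents per hundredweight, for illustration only.
coded_example = coded_log_ratio(790.0, 760.0)
```

Under these assumptions a coded value near 177, the general mean reported later, corresponds to a price ratio of roughly 1.04.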


TABLE 1
DAILY LOG RATIOS OF THE PRICES OF HOGS AT CINCINNATI AND LOUISVILLE*

180-200 lbs. class

1937      Mon.   Tues.    Wed.  Thurs.    Fri.     Av.
Jan.        88     108     121     104     103   104.8
Feb.       145     176     191     168     185   173.0
Mar.       148     138     151     169     133   147.8
Apr.       156     179     141     140     170   157.2
May        133     145     132     144     113   133.4
June       181     156     199     165     185   177.2
July       208     207     239     245     214   222.6
Aug.       201     231     252     230     260   234.8
Sept.      203     208     196     156     248   202.2
Oct.       122     163     196     155     185   164.2
Nov.       186     225     187     174     204   195.2
Dec.       166     194     174     148     157   167.8
Av.      161.4   177.5   181.6   166.5   179.8   173.4

200-220 lbs. class

1937      Mon.   Tues.    Wed.  Thurs.    Fri.     Av.
Jan.        88     108     121     104     103   104.8
Feb.       145     176     191     168     185   173.0
Mar.       150     138     151     169     133   148.2
Apr.       170     184     162     161     195   174.4
May        172     179     171     178     153   170.6
June       181     168     205     174     194   184.4
July       208     207     239     245     214   222.6
Aug.       201     231     252     230     247   232.2
Sept.      209     212     196     156     248   204.2
Oct.       122     121     156     120     155   134.8
Nov.       113     146     108      94     112   114.6
Dec.       116     145     123      97     133   122.8
Av.      156.2   167.9   172.9   158.0   172.7   165.5

* The logarithms have been multiplied by 10,000.
TABLE 1 (Continued)

180-200 lbs. class

1938      Mon.   Tues.    Wed.  Thurs.    Fri.     Av.
Jan.       110      63      61     126      97    91.4
Feb.         …     219     226     202     204       …
Mar.         …     221     232     235     233       …
Apr.         …     300     313     288     308       …
May          …     226     213     204     246       …
June         …     199     198     189     200       …
July         …     202     192     194     204       …
Aug.         …     224     268     249     214       …
Sept.        …     153     185     154     159       …
Oct.         …     162     174     164     210       …
Nov.         …     166     163     145     156       …
Dec.         …     169     174     187     184       …
Av.          …   192.0   199.9   194.8       …   197.9

1939      Mon.   Tues.    Wed.  Thurs.    Fri.     Av.
Jan.         …     172     211     206       …       …
Feb.         …     201     178     178       …       …
Mar.         …     206     203     203       …       …
Apr.         …     239     258     216       …       …
May          …     243     262     298       …       …
June         …     275     244     254       …       …
July         …     263     263     271       …       …
Aug.         …     211     196     211       …       …
Sept.        …     146     139     191       …       …
Oct.         …     165     157     156       …       …
Nov.         …     220     213     183       …       …
Dec.         …     178     213     206       …       …
Av.          …   209.9   211.4   214.4       …   212.3

1940      Mon.   Tues.    Wed.  Thurs.    Fri.     Av.
Jan.         …     212     222     195       …       …
Feb.         …     148     170     198       …       …
Mar.         …     155     139     109       …       …
Apr.         …      96     109     157       …       …
May          …     208     242     182       …       …
June         …     235     210     202       …       …
July         …     158     154     154       …       …
Aug.         …      94      80      88       …       …
Sept.        …     -31      -2      11       …       …
Oct.         …      13      31      51       …       …
Nov.         …      78      84      75       …       …
Dec.         …      45     151     150       …       …
Av.          …   117.6   131.8   131.0       …   135.0

1941      Mon.   Tues.    Wed.  Thurs.    Fri.     Av.
Jan.         …     143     121     104       …       …
Feb.         …     137     164     141       …       …
Mar.         …     158     151     127       …       …
Apr.         …     170     163     176       …       …
May          …     185     196     195       …       …
June         …     144     132     139       …       …
July         …     114     109     120       …       …
Aug.         …      97     112      94       …       …
Sept.        …      73      93     100       …       …
Oct.         …      70      92      71       …       …
Nov.         …     114     100      91       …       …
Dec.         …     150     139     122       …       …
Av.          …       …       …       …       …   128.2

200-220 lbs. class

1938      Mon.   Tues.    Wed.  Thurs.    Fri.     Av.
Jan.        97      61      29     100      64    70.2
Feb.       158     176     186     157     194   174.2
Mar.       222     221     228     235     233   227.8
Apr.       306     300     313     288     308   303.0
May        276     226     213     204     246   233.0
June       218     199     198     189     195   199.8
July       193     202     192     194     204   197.0
Aug.       226     248     302     283     255   262.8
Sept.      221     214     243     209     214   220.2
Oct.       174     175     187     164     203   180.6
Nov.        83     109     117      76      74    91.8
Dec.       184     137     153     158     130   152.4
Av.      196.5   189.0   196.8   188.1   193.3   192.7

1939      Mon.   Tues.    Wed.  Thurs.    Fri.     Av.
Jan.       156     141     181     182     174   166.8
Feb.       211     260     241     231     244   237.4
Mar.       222     285     276     270     248   260.2
Apr.       303     331     346     297     308   317.0
May        323     302     311     354     327   323.4
June       292     283     252     254     249   266.0
July       234     263     263     271     291   264.4
Aug.       242     211     203     224     226   221.2
Sept.      143     146     139     191     174   158.6
Oct.         …       …       …       …       …   160.4
Nov.         …       …       …       …       …   199.6
Dec.       188     142     177     160     173   168.0
Av.      221.5   227.3   229.1   230.8   234.2   228.6

1940      Mon.   Tues.    Wed.  Thurs.    Fri.     Av.
Jan.         …       …       …       …       …   269.0
Feb.       292     253     277     306     287   283.0
Mar.         …       …       …       …       …   247.2
Apr.         …       …       …       …       …   132.0
May          …       …       …       …       …   213.4
June         …       …       …       …       …   221.8
July         …       …       …       …       …   163.6
Aug.         …       …       …       …       …   174.4
Sept.        …       …       …       …       …   179.4
Oct.       147     171     196     218     209   188.2
Nov.       171     165     171     166     210   176.6
Dec.       175     171     197     166     190   179.8
Av.      208.5   192.2   200.4   195.7   215.1   202.4

1941      Mon.   Tues.    Wed.  Thurs.    Fri.     Av.
Jan.       126     159     140     120     155   140.0
Feb.       183     183     184     162     193   181.0
Mar.       167     185     178     127     160   163.4
Apr.       182     170     163     176     157   169.6
May        171     185     196     195     179   185.2
June       126     144     132     139     153   138.8
July       100     114     109     116     122   112.2
Aug.       103      97     112      94      93    99.8
Sept.       67      73     102     110      91    88.6
Oct.       116      94     112      83     105   102.0
Nov.        66      92     144     104     102   101.6
Dec.       152     149     164     176     168   161.8
Av.      129.9   137.1   144.7   133.5   139.8   137.0
PRELIMINARY INSPECTION OF THE DATA


As pointed out above, the economists are interested in the answers 
to questions like the following: 


(a) Were there any consistent differences among the average yearly 
price ratios? 

(b) Were the average ratios for one class consistently greater than 
for the other, indicating that Cincinnati preferred one type of 
pork and Louisville the other? 

(c) Was there any evidence that the class averages were really dif- 
ferent for a given year but the ratios were not consistent from 
year to year? 

(d) Was there any consistent seasonal pattern as shown by the 
monthly averages? 

(e) Might the seasonal pattern change from year to year? 

(f) Might one market have consistently higher prices than the other 
on certain days of the week? 

(g) Are there perceptible differences in the conclusions if ratios in- 
stead of differences are used in the analysis? 


The first step in formulating answers to these questions is to compute
the averages mentioned in the questions. First let us look at the over-all
month, day, year, and class averages, which are presented in Table 2.
In order to compare the analysis of the average log price ratios with the
average price differentials discussed in my Atlantic City paper,¹ I have
also included the latter in Table 2. Some minor errors were found in
the data used in the original paper; they have been corrected in the
averages presented in Table 2.

¹ "Differentials in Hog Prices," delivered on January 24, 1947, at the Annual Meeting of the
American Statistical Association in Atlantic City, N. J.

TABLE 2
AVERAGE LOG PRICE RATIOS AND PRICE DIFFERENTIALS

Months        Log Ratio  Difference    Days of the Week  Log Ratio  Difference
January         147.4       26.2       Monday              175.2       33.1
February        194.3       36.2       Tuesday             174.0       33.1
March           190.5       35.2       Wednesday           180.0       34.5
April           208.9       38.1       Thursday            173.6       33.1
May             215.6       39.1       Friday              183.8       35.0
June            200.9       37.8
July            192.1       41.3       Years
August          185.9       39.2       1937                169.5       42.3
September       146.6       32.0       1938                195.3       38.7
October         137.6       26.1       1939                220.4       35.5
November        144.9       25.3       1940                168.7       23.2
December        162.8       28.8       1941                132.6       29.3

Weight classes
180-200 lbs.    169.4       32.6
200-220 lbs.    185.2        …

From these averages, one might make the following tentative
answers to the above questions.

(a) It appears that the average annual price differences dropped
    sharply from 1937 to 1940 and started to rise again in 1941. This
    result indicated that the price differentials were functions of the
    actual prices, because the differentials tended to follow the
    average level of hog prices for the period under study. The
    average hog prices (cents/lb.) were 9.5, 7.7, 6.2, 5.4 and 9.1 for
    the years 1937-1941, respectively. On the other hand, the ratios
    of the two prices seem to be slightly negatively correlated with
    the price level. This would indicate that the use of price ratios
    may slightly overcorrect for the correlation between the price
    difference and the price level. Both sets of yearly averages do
    indicate that there are important year to year price differences.

(b) The averages for the two classes are too nearly alike to be able
    to say if the lighter class was consistently cheaper, relatively
    speaking, in Cincinnati. It does not appear that the difference
    would be of any practical importance.

(d) Both the ratios and differences were decidedly smaller in the
    winter months than in the summer, indicating Cincinnati preferred
    heavier classes in the summer more than did Louisville.

(f) The daily averages differed so little that it is unlikely that one
    could adjudge the day-to-day differences to be of any economic
    importance.

(g) For all other averages, there is very little difference between the
    general conclusions based on the log ratios and on the
    differences.


In order to answer such questions as (c) and (e) regarding the
consistency of the class and month averages from year to year, we need
the averages for each year. These are given for the log price ratios in
Table 3.

TABLE 3
AVERAGE MONTH AND CLASS LOG PRICE RATIOS FOR EACH YEAR

                             Year
Month           1937    1938    1939    1940    1941   Average
January        104.8    80.8   180.0   238.7   132.8    147.4
February       173.0   193.6   210.3   228.2   166.3    194.3
March          148.0   227.0   224.4   201.4   151.9    190.5
April          165.8   303.6   272.2   133.5   169.6    208.9
May            152.0   233.0   294.5   213.4   185.2    215.6
June           180.8   200.3   263.0   221.8   138.8    200.9
July           222.6   197.0   264.4   163.2   113.3    192.1
August         233.5   247.1   216.1   133.2    99.8    185.9
September      203.2   192.4   158.6    93.2    85.8    146.6
October        149.5   178.1   160.4   108.3    91.5    137.6
November       154.9   123.7   209.0   133.6   103.5    144.9
December       145.3   167.2   192.4   156.0   153.0    162.8

Weight class
180-200        173.4   197.9   212.3   135.0   128.2    169.4
200-220        165.6   192.7   228.6   202.4   137.0    185.2
Average        169.5   195.3   220.4   168.7   132.6    177.3

(c) The comparative log ratios for each class were decidedly
    inconsistent from year to year. The small difference between the
    over-all averages for the two classes is non-significant as judged
    by the change from the first two years to the last three years, but
    there is evidence of important class differences during some years.
    The same general conclusions held for the price differences.

(e) There is evidence of a decided irregularity in the seasonal
    pattern, especially when comparing the first two years against the
    last two. Despite this irregularity there seems to be some
    evidence that the average log price ratio is smaller during the
    winter than during the summer months. Again the same general
    conclusions held for the price differences.


These tentative conclusions should be checked by a more exact
statistical procedure, which considers all sources of variability in
deciding if a given set of log price ratios is really consistent. It is the
purpose of this paper to present two methods of making this variability
study, both based on the analysis of variance: one using the F-test and
the other the $t$-test on the variance components.


ANALYSIS OF VARIANCE 


These data seem ideally set up for a factorial analysis of variance as
shown in the left side of Table 4. I have included an analysis of the
price differences as well, but the discussion will mainly concern the log
price ratios. The analysis of variance is a simple arithmetic device of
dividing the total variation into separate independent parts. In order
to use these separate components of variation, it is assumed that

(a) The log price ratio for a given month, day, class, and year is
    estimated by the equation

$$p_{ijkl} = m + d_i + m_j + (dm)_{ij} + y_k + (dy)_{ik} + (my)_{jk} + (dmy)_{ijk} + c_l + \cdots + (dmyc)_{ijkl}$$

$$(i = 1, 2, \ldots, 5;\ j = 1, 2, \ldots, 12;\ k = 1, 2, \ldots, 5;\ l = 1, 2).$$

    $m$ represents some over-all effect and is estimated by the general
    mean. $m_1$ represents the added effect due to the first month and
    is estimated by the mean for this month less $m$. Similarly for the
    $d$, $c$, and $y$ effects. These are generally called main effects.
    An effect like $(my)_{11}$ measures the failure of $m$, $m_1$, and
    $y_1$ to completely account for the average log price ratio in the
    first month of the first year. Such an effect is called an
    interaction. From the data in Tables 2 and 3, we secure the
    following estimates:

$$m = 177.3, \quad m_1 = -29.9, \quad y_1 = -7.8, \quad (my)_{11} = -34.8.$$

(b) Certain of the effects in equation (a) above must be assumed to
    be randomly drawn from a large number of possible effects if
    the analysis of variance is to be useful in making tests of
    significance. Some of the effects may be merely fixed constants. The
    statistician must decide for himself on the basis of available
    information regarding the data being analyzed which factors
    should be considered random and which fixed (or systematic, as
    fixed effects are sometimes called). In this price ratio problem,
    it seems reasonable to assume that all of the main effects and the
    (dm) interaction are fixed constants, the latter because there are
    only 60 (dm) values which could be obtained and all of them are
    represented in our sample. Also since the 60 (my) values are
    continuous from January 1937 through December 1941, we shall
    assume the (my) interaction is also fixed. There is some question
    as to what we should assume regarding the (dy) and (dmy)
    interactions. For the time being, we shall assume they are
    representative of an infinite population of such interactions; hence,
    assume they are random.

(c) If it is desired to make tests of significance, each random factor
    must be regarded as a normally and independently distributed
    variate with the same variance for every observation.

TABLE 4
ANALYSIS OF VARIANCE OF LOG PRICE RATIOS

Source of variation    Degrees of freedom    Mean square
Days (D)                       4                 2,345
Months (M)                    11                38,432
D×M                           44                   683
Years (Y)                      4               129,613
D×Y                           16                   914
M×Y                           44                15,949
D×M×Y                        176                   755
Classes (C)                    1                37,840
D×C                            4                  69.0
M×C                           11                 4,943
D×M×C                         44                  70.4
Y×C                            4                27,753
D×Y×C                         16                  43.3
M×Y×C                         44                 3,526
D×M×Y×C                      176                  62.1
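The estimates quoted under assumption (a) can be reproduced from the averages in Tables 2 and 3. The short sketch below is only an arithmetic check; the function and variable names are hypothetical. Under the definition given in (a), the interaction for January 1937 comes out with magnitude 34.8 and a negative sign.

```python
def main_effect(level_mean, grand_mean):
    """Main effect: the mean for one level less the general mean."""
    return level_mean - grand_mean


def two_factor_interaction(cell_mean, effect_a, effect_b, grand_mean):
    """What remains of a cell mean after the general mean and both main effects."""
    return cell_mean - (grand_mean + effect_a + effect_b)


grand_mean = 177.3        # general mean (Table 3)
january_mean = 147.4      # January average over all years (Tables 2 and 3)
year_1937_mean = 169.5    # 1937 average (Tables 2 and 3)
january_1937_mean = 104.8 # January 1937 average (Table 3)

m_1 = main_effect(january_mean, grand_mean)
y_1 = main_effect(year_1937_mean, grand_mean)
my_11 = two_factor_interaction(january_1937_mean, m_1, y_1, grand_mean)
```

The same two functions apply to any other month and year in Table 3.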


Let us examine the nature of these assumptions for these data.

(a) Equation (a) assumes that the effects are additive. This
    assumption is equivalent to assuming that the effects are
    multiplicative in relation to the actual prices. This multiplicative
    relationship was indicated by the positive correlation between the
    price differences and the price level.

(b) Economic data are seldom normally distributed; however, slight
    non-normality is not considered to be a serious defect so long as
    the effects are independent.

(c) The variance of prices is usually positively correlated with the
    price level, a serious defect if the analysis of variance is to be
    used. However, the use of the logarithmic transformation should
    remedy this defect.

(d) The most serious defect with economic data is their lack of
    independence. Is it reasonable to assume that such an effect as
    $(dc)_{11}$ is independent of $(dc)_{21}$? These effects could be
    independent only if the correlation between the daily averages for
    a given class is eliminated by the removal of the day (d) and class
    (c) fixed effects. Unfortunately no adequate test has as yet been
    devised to test for the existence of serial correlation between the
    residuals after the removal of the effects of a linear factor. In
    the absence of any formal theory on this topic, I am assuming that
    such effects as $(dc)_{11}$ and $(dc)_{21}$ are nearly enough
    independent to make the analysis approximately correct. It should
    be noted that only the random effects are assumed independent. It
    is possible that the use of the logarithmic transformation might
    also help correct for any lack of independence. This entire topic
    needs much theoretical investigation. Perhaps comparisons with
    known random series might be useful.


A detailed discussion of these topics is given by Eisenhart, Cochran and 
Bartlett in the March 1947 issue of Biometrics. 

Assuming that the factors m, d, c, y, (dm), and (my) do represent
fixed effects and that all other factors represent random effects, it can
be shown that the mean squares given in the analysis of variance table
(Table 4) are estimates of the variance components given on the right
side of this table. $\sigma_d^2$, $\sigma_m^2$, $\sigma_y^2$, $\sigma_c^2$,
$\sigma_{dm}^2$, and $\sigma_{my}^2$ are merely dummy symbols which
represent sums of squares of differences between the fixed effects. All
of the other $\sigma_i^2$ represent random variances associated with the
random factors, with assumed infinite populations. A good discussion of
the determination of these coefficients is given by H. E. Daniels [2].²
A recent discussion based entirely on random components is given by
S. L. Crump [1]. It might be mentioned that $\sigma_{dm}^2$, for example,
does not occur with the variance components for the day and month mean
squares because it is a fixed effect.

² Daniels uses a somewhat different notation for the fixed effect from that used in this paper.

F-TEST


Two methods of testing the hypothesis that $\sigma_i^2 = 0$ are available.
The most widely used is the F-test (sometimes called the variance ratio
test) or Fisher's z-test. F is defined as the ratio of two independent
mean squares, each of which is distributed as $\chi^2\sigma^2/n$, where
$n$ is the number of degrees of freedom for $\chi^2$.

$$F = \frac{n_1\chi_2^2\sigma_2^2}{n_2\chi_1^2\sigma_1^2} \doteq \frac{V_2}{V_1},$$

where $n_i$ is the number of degrees of freedom associated with the mean
square $V_i$ and $\doteq$ stands for "is estimated by." $\sigma_1^2$ is
called the error variance. If the hypothesis to be tested provides that
$\sigma_1^2 = \sigma_2^2$, then F is independent of the variance
parameters, a condition necessary for any test of significance which
does not specify the value of these parameters. The significance levels
of F have been derived assuming that the only alternative to the
hypothesis $\sigma_1^2 = \sigma_2^2$ is that $\sigma_2^2 > \sigma_1^2$.

a. Test of Existence of Real Random Variation. As an example of the
use of the F-test to test for random variation, consider the test of the
null hypothesis that $\sigma_{myc}^2 = 0$. From Table 4, we see that the
best estimates of $\sigma_1^2$ and $\sigma_2^2$ are $V_{dmyc}$ and
$V_{myc}$, respectively, because they are both estimates of the same
variance $\sigma_{dmyc}^2$ under the null hypothesis and
$\sigma_2^2 > \sigma_1^2$ when $\sigma_{myc}^2 > 0$. That is, $V_{myc}$
is an estimate of

$$5\sigma_{myc}^2 + \sigma_{dmyc}^2,$$

while $V_{dmyc}$ is an estimate of $\sigma_{dmyc}^2$; therefore, the
expected value of $V_{myc}$ is at least as great as that of $V_{dmyc}$.
Hence $F = V_{myc}/V_{dmyc} = 3526/62.1 = 56.8$ with 44 and 176 degrees
of freedom.

$V_{dmyc}$ is the error variance for testing for the existence of all
three-factor interactions. The results for all four of these three-factor
interactions are given in the following table:

Interaction      MYC      DYC      DMC      DMY
V_2             3526     43.3     70.4      755
F               56.8     0.70     1.13     12.2
P              <.001     >.50     >.20    <.001

P is the probability that a value of F as large as or larger than the
computed F could have been obtained from a random sample with the given
interaction being non-existent. It is apparent that the same test could
be used if the three-factor interactions were assumed to be fixed
effects. In this case the null hypothesis would be that these effects were
equal. It is evident that the DYC and DMC interactions are not
significant, while MYC and DMY are highly significant.
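The variance ratios for the three-factor interactions can be checked with a few lines of Python. This is only an arithmetic sketch; the names are hypothetical, the significance probabilities are not computed, and the DYC mean square is taken as 43.3, the value implied by the reported F of 0.70 and by the rounded 43 used later in the degrees-of-freedom computation.

```python
ERROR_MEAN_SQUARE = 62.1  # V_dmyc, with 176 degrees of freedom

# (mean square, degrees of freedom) for each three-factor interaction
THREE_FACTOR = {
    "MYC": (3526.0, 44),
    "DYC": (43.3, 16),
    "DMC": (70.4, 44),
    "DMY": (755.0, 176),
}


def variance_ratio(tested_mean_square, error_mean_square):
    """F as the ratio of the tested mean square to the error mean square."""
    return tested_mean_square / error_mean_square


f_values = {
    name: variance_ratio(mean_square, ERROR_MEAN_SQUARE)
    for name, (mean_square, _df) in THREE_FACTOR.items()
}
```

Each ratio would then be referred to a table of F with the interaction's degrees of freedom and 176 error degrees of freedom.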

The problem of testing for the existence of the random two-factor
interactions is complicated by the composite nature of the error
variances. As an example, consider the problem of testing the null
hypothesis that $\sigma_{yc}^2 = 0$. Following the procedure set up above,
we let $V_2 = V_{yc}$. If we set
$\sigma_1^2 = 5\sigma_{myc}^2 + 12\sigma_{dyc}^2 + \sigma_{dmyc}^2$, then
$\sigma_1^2 = \sigma_2^2$ under the null hypothesis $\sigma_{yc}^2 = 0$,
and $\sigma_2^2 > \sigma_1^2$ if $\sigma_{yc}^2 > 0$. The difficulty is
that no single mean square is an estimate of $\sigma_1^2$.

Two alternative methods of making the test of significance are
available:

(i) Make a composite null hypothesis, $\sigma_{dyc}^2 = 0$ and
    $\sigma_{yc}^2 = 0$. Now $V_{yc}$ is an estimate of
    $5\sigma_{myc}^2 + \sigma_{dmyc}^2$, which is also estimated by
    $V_{myc}$. Hence $F = V_{yc}/V_{myc} = 27753/3526 = 7.87$ with 4
    and 44 degrees of freedom, a highly significant value (P<.001).
    The test of the original hypothesis ($\sigma_{yc}^2 = 0$) is
    probably more powerful if the composite hypothesis includes
    $\sigma_{dyc}^2 = 0$ rather than $\sigma_{myc}^2 = 0$, on the basis
    of the tests of the three-factor interactions. In this case, I
    doubt if anyone would be too perturbed about making the assumption
    that $\sigma_{dyc}^2 = 0$, because $V_{dyc}$ is much less than
    $V_{dmyc}$.

(ii) Estimate the value of $\sigma_1^2$ as follows
    ($\sigma_1^2 \doteq V_1$):

$$\sigma_1^2 = (5\sigma_{myc}^2 + \sigma_{dmyc}^2) + (12\sigma_{dyc}^2 + \sigma_{dmyc}^2) - \sigma_{dmyc}^2$$

$$V_1 = V_{myc} + V_{dyc} - V_{dmyc} = 3507.$$

    The problem is to determine the number of degrees of freedom
    in this estimate of $\sigma_1^2$. F. E. Satterthwaite [3] has
    extended a result of H. Fairfield Smith [4] to approximate the
    number of degrees of freedom in $V_1$. In general if

$$V_1 = a_1V_{11} + a_2V_{12} + \cdots,$$

    the approximate number of degrees of freedom in $V_1$ is

$$f_1 = \frac{V_1^2}{\dfrac{(a_1V_{11})^2}{f_{11}} + \dfrac{(a_2V_{12})^2}{f_{12}} + \cdots},$$

    where $f_{1i}$ is the number of degrees of freedom for $V_{1i}$, and
    all $a_i = \pm 1$ in our problems. Hence for the test of the YC
    interaction,
$$f_1 = \frac{(3507)^2}{\dfrac{(3526)^2}{44} + \dfrac{(43)^2}{16} + \dfrac{(62)^2}{176}} = 43.5$$

    and $F' = 27,753/3,507 = 7.91$ with 4 and 43.5 degrees of freedom.
    This $F'$ is not distributed according to the F-distribution, but
    the approximation is probably not too bad.

    These two methods give approximately the same results for
    testing the DY, DC and MC interactions, except for a decided
    reduction in the number of error degrees of freedom for DC by
    the second method. The results using the two methods to test
    for these interactions are given below.

              Composite Hypothesis           Approximate Method
    V_2       V_1            d.f.    F       V_1     d.f.    F'
    DY         755 (DMY)     176    1.21      736    160     1.24
    DC        70.4 (DMC)      44     .98     51.6    10.6    1.34
    MC        3526 (MYC)      44    1.40     3534    44.2    1.40
    YC        3526 (MYC)      44    7.87     3507    43.5    7.91

    P for the first three interactions is >.20. These results using
    log price ratios coincide almost exactly with those obtained from
    an analysis of the price differences.
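The approximation to the error degrees of freedom lends itself to a small helper. The sketch below applies the general formula for $f_1$ to the YC test; the function name is hypothetical, and the DYC mean square is taken as 43.3 (shown rounded to 43 in the worked computation).

```python
def satterthwaite(components):
    """Combine mean squares and approximate the degrees of freedom.

    `components` is a list of (a_i, V_i, f_i): the coefficient (+1 or -1),
    the mean square, and its degrees of freedom. Returns (V, f).
    """
    combined = sum(a * ms for a, ms, _df in components)
    denominator = sum((a * ms) ** 2 / df for a, ms, df in components)
    return combined, combined ** 2 / denominator


# Error term for the YC interaction: V_myc + V_dyc - V_dmyc
yc_error = [(+1, 3526.0, 44), (+1, 43.3, 16), (-1, 62.1, 176)]
v_1, f_1 = satterthwaite(yc_error)
f_prime = 27753.0 / v_1   # V_yc divided by the estimated error variance
```

Under these inputs the helper gives an error variance of about 3507 with about 43.5 degrees of freedom, in agreement with the figures quoted for the YC interaction.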
(b) Tests of Existence of Differences Between Fixed Effects.

(i) The composite hypothesis for the MY interaction involves the necessity of assuming that either $\sigma_{dmy}^2$ or $\sigma_{myc}^2$ is 0, when we have already shown that the estimates of both were so large as to indicate the true variances were greater than 0. It has been advocated that we use the larger of the two mean squares, $V_{dmy}$ and $V_{myc}$, as the estimate of $\sigma_1^2$. In our case, this would be assuming $\sigma_1^2 = V_{myc} = 3526$, and $F = 15949/3526 = 4.52$ with 44 and 44 degrees of freedom, a highly significant value (P<.001). This procedure actually involves assuming that $\sigma_{dmy}^2 = 0$, a quite unrealistic assumption as shown by the previous analysis of the DMY interaction. In this case the approximation method of Satterthwaite seems to be somewhat more realistic. We have

$$\sigma_1^2 = 2\sigma_{dmy}^2 + 5\sigma_{myc}^2 + \sigma_{dmyc}^2,$$

$$V_1 = V_{dmy} + V_{myc} - V_{dmyc} = 755 + 3526 - 62 = 4219.$$

Hence

$$f_1 = \frac{(4219)^2}{\dfrac{(755)^2}{176} + \dfrac{(3526)^2}{44} + \dfrac{(62)^2}{176}} = 62.3$$

and F′ = 15949/4219 = 3.78 with 44 and 62.3 degrees of freedom, also giving P<.001.
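
The approximate error term and its degrees of freedom can be checked numerically. The following is a minimal sketch, assuming only the mean squares and degrees of freedom quoted in the text; the function name `satterthwaite` and the tuple layout are illustrative choices, not part of the original paper.

```python
def satterthwaite(terms):
    """Combine independent mean squares into one approximate error term.

    terms: list of (sign, mean_square, degrees_of_freedom), sign = +1 or -1.
    Returns the combined mean square and its approximate degrees of freedom.
    """
    combined = sum(sign * ms for sign, ms, _ in terms)
    denominator = sum(ms ** 2 / df for _, ms, df in terms)
    return combined, combined ** 2 / denominator


# Error term for the MY interaction: V_dmy + V_myc - V_dmyc
my_error, my_df = satterthwaite([(+1, 755.0, 176), (+1, 3526.0, 44), (-1, 62.0, 176)])
my_f = 15949.0 / my_error

# Error term for the month effect: V_mc + V_dmy - V_dmyc
m_error, m_df = satterthwaite([(+1, 4943.0, 11), (+1, 755.0, 176), (-1, 62.1, 176)])

print(round(my_error), round(my_df, 1), round(my_f, 2))
print(round(m_error), round(m_df, 1))
```

With the values above, the sketch reproduces the figures in the text for MY (4219, 62.3 degrees of freedom, F′ = 3.78) and the tabulated month error term (5,636 with about 14.3 degrees of freedom).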

(ii) The DM interaction is undoubtedly non-significant, because its mean square is less than for the DMY interaction.

(iii) The complications mentioned above in testing the two-factor interactions are intensified when we try to handle the tests of the hypotheses that $\sigma_c^2 = 0$, $\sigma_y^2 = 0$, etc. By making composite hypotheses of the type used above, it is easy to test the C, Y, M, and D effects by simple F-tests with error terms given by the mean squares for YC, YC, MC, and DY, respectively. If the approximation method mentioned above is used, the estimates of the error variances are:

C: $V_{yc} + V_{mc} + V_{dc} - V_{myc} - V_{dyc} - V_{dmc} + V_{dmyc}$

Y: $V_{dy} + V_{yc} - V_{dyc}$

M: $V_{mc} + V_{dmy} - V_{dmyc}$

D: $V_{dy} + V_{dc} - V_{dyc}$.


The results using the two methods to test for these main effects are given below.

                Composite Hypothesis            Approximate Method
Effect      V_i             d.f.     F          V_i       d.f.     F′
C           27,753 (YC)       4     1.36        29,188     4.4    1.30
Y           27,753 (YC)       4     4.67        28,624     4.3    4.52
M           4,943 (MC)       11     7.78*       5,636     14.3    6.82*
D           914 (DY)         16     2.57        940       16.5    2.50

* P approximately .001.

Except for the month effect, all of the main effects are non-significant. The significance probability for both year and day effects is about .10. The results are quite similar for the two types of analysis.

(iv) The same general conclusions were reached using the price differences instead of the log price ratios with the exceptions of the tests of the year and day effects, for which the significance probability was less than 5 per cent. The results are given below for these price differences.

                Composite Hypothesis            Approximate Method
Effect      V_i       d.f.     F                V_i       d.f.     F′
MY          100.5      44      6.90†            123.3     65.2     5.63†
C           584.4       4      1.39             633.3      4.6     1.28
Y           584.4       4     11.94*            613.3      4.4    11.37*
M           149.7      11     11.33†            172.5     14.6     9.83†
D            30.6      16      3.33*             30.5     15.7     3.34

* P<.05.
† P<.001.


(c) Summary. Let us compare the results of these tests of significance 
with the tentative conclusions advanced earlier. 


(i) The differences between the yearly price ratios, although ap- 
preciable, were not consistent enough from class to class to ad- 
judge them to be significant; however, there were highly signifi- 
cant year effects for the price differences. These results indicate 
that the main cause of fluctuations in the comparative prices of 
the markets from year to year probably was the general price 
level itself. 

(ii) The day-to-day fluctuations were almost the same for the two 
analyses. There was a slight but rather consistent tendency for 
comparatively higher prices in Cincinnati on Wednesday and 
Friday. 

(iii) There seemed to be a consistent seasonal effect, which was almost the same for both the price ratios and price differences.

(iv) There was no evidence of any real class-to-class differences. 

(v) The highly significant YC and MY interactions confirmed our 
original premises. 


One other point should be emphasized. All of the interactions with the year effects were significant except DY and DYC, while none of the other interactions was significant. If the data had not been tabulated on a yearly basis or if only one year’s data were available, no estimates of the year interactions could have been obtained. Under this condition, such interactions as MC and DM would have been adjudged highly significant. In other words, the main sources of variation were associated with the year effects; hence, quite fallacious conclusions regarding sources of variability might be drawn from data for which the interactions with the year effects could not be obtained.

This general conclusion regarding the effect of the interactions with 
the year effects might be of paramount importance in making recom- 
mendations on economic policies to reduce discrepancies between the 
prices on different markets. Without a knowledge of the YC inter- 
action, the difference between the class averages might have been 
judged significant, although in this case the difference was too small 
to be of much economic significance. 

The main cause of fluctuations in price differentials apparently is the year-to-year changes, which in turn appear to be highly correlated with the general price level. By the use of price ratios, we seem to have removed to a large extent the effects of the general price level on the analysis.

Finally, one might be interested in knowing the effect of assuming that the (dy) and (dmy) interactions were also fixed. If this assumption were also made, all effects except the interactions with C would be fixed. Hence the error terms would be quite simple for testing all of the fixed effects except C. In fact, the error mean square would be exactly (Effect)×C in all cases. There would be no change in the tests of the effects having C as a part of the variation. The results for the fixed effects except C would then be (for the log price ratios):

Effect      Error Mean Square       F
DMY         62.1 (DMYC)           12.16
MY          3526. (MYC)            4.52
DY          43.2 (DYC)            21.16
Y           27,753. (YC)           4.67
DM          70.4 (DMC)             8.28
M           4,943. (MC)            7.78
D           69.0 (DC)             34.03

All effects are highly significant except the year effect, for which the analysis is not changed by the new assumption. The main change is the significance of the day effects and of the interactions with days.


AN ALTERNATIVE TEST OF SIGNIFICANCE 


A more sound statistical procedure for testing the hypothesis that such a variance as $\sigma_{yc}^2 = 0$ might be to find some other test criterion than the F-test. If we let $E(V_{yc}) = \sigma_1^2$, $E(V_{dmyc}) = \sigma_2^2$, $E(V_{dyc}) = \sigma_3^2$, and $E(V_{myc}) = \sigma_4^2$, our null hypothesis might be

$$\sigma_1^2 + \sigma_2^2 - \sigma_3^2 - \sigma_4^2 = 0.$$

If we make no restrictions on the alternative hypothesis, we find that the likelihood test criterion becomes

$$\lambda = \prod_{i=1}^{4} \left(V_i/\hat{\sigma}_i^2\right)^{n_i/2} < \lambda_0,$$

where $\hat{\sigma}_i^2$ is the maximum likelihood estimate of $\sigma_i^2$. Referring to Bartlett’s $\chi^2$-test for homogeneity of variance, it seems reasonable to assume that

$$-2\log\lambda = \sum n_i \log\left(\hat{\sigma}_i^2/V_i\right)$$

is approximately distributed as $\chi^2$ with 1 degree of freedom, so that we can use as our rejection region

$$\chi^2 > \chi_0^2 = -2\log\lambda_0.$$

The $\hat{\sigma}_i^2$ are solutions of this set of equations:

$$n_i\left(V_i - \hat{\sigma}_i^2\right)/\hat{\sigma}_i^4 = \begin{cases} L & \text{for } i = 1, 2 \\ -L & \text{for } i = 3, 4 \end{cases}$$

$$L = d \Big/ \sum \left(\hat{\sigma}_i^4/n_i\right),$$

where $d = V_1 + V_2 - V_3 - V_4$.

An iterative method was used to solve for the $\hat{\sigma}_i^2$ with the following results:

Mean Square        YC         DMYC       DYC        MYC
n                   4          176        16         44
V               27753           62        43       3526
σ̂²               5540        61.94     43.33       5559
σ̂²/V           .19962       .99903    1.0077     1.5766

$$\sum n_i \log\left(\hat{\sigma}_i^2/V_i\right) = 13.53, \qquad P(\chi^2 > 13.53) < .001.$$
It should be noted that this is a two-tailed test; hence, the significance probability would be expected to be about twice that obtained by the F-test. The following results were obtained for other effects:

Effect      χ₀²       P(χ² > χ₀²)     P(F)
MY         20.95        <.001         <.001
Y           2.34         .14           .08
C           1.60         .20           .10

Some work is now being done by R. A. Porter on the properties of 
this method of making the significance test. Since the method is in a 
very “rough” stage, I am not advocating its use now but it merits fur- 
ther investigation. 


VARIANCE COMPONENTS 


Another method of testing for significant sources of variation and for significant differences among fixed effects is to compute estimates of the variance components ($\sigma_d^2, \sigma_m^2, \cdots, \sigma_{dmyc}^2$) and standard errors of these estimates. It is quite easy to determine unbiased estimates of the variance components but the standard errors become quite complicated for the fixed effects. We can construct unbiased estimates of these standard errors but it has not been proven that these estimates are “best” in the maximum likelihood sense. It should be stated that we do not assume that the variance components are normally distributed when a test of significance is not required.

The estimate of any variance component is found by subtracting the error variance for any effect from its own mean square and dividing by the appropriate constant as given in Table 4. The complete results are given in Table 5 below but a few examples using the log price ratios are presented to illustrate the method of computation.

$$\sigma_{dmyc}^2 = 62.1$$
$$\sigma_{myc}^2 = (3526 - 62)/5 = 692.8$$
$$\sigma_{yc}^2 = (27753 - 3507)/60 = 404.1$$
$$\sigma_y^2 = (129{,}513 - 28{,}624)/120 = 840.7.$$
There is some question as to what are the best estimates of $\sigma_{yc}^2$ and $\sigma_y^2$, because the estimated value of $\sigma_{dyc}^2$ is negative. Since no variance component can be negative, it is often assumed that the estimate of $\sigma_{dyc}^2$ is negative because of sampling fluctuations alone and that the best estimate of $\sigma_{dyc}^2$ is 0 instead of $(-18.8)/12 = -1.57$. We see that

$$E(V_{yc}) = \sigma_{dmyc}^2 + 5\sigma_{myc}^2 + 12\sigma_{dyc}^2 + 60\sigma_{yc}^2.$$

Hence

$$\sigma_{yc}^2 = (V_{yc} - V_{myc} - V_{dyc} + V_{dmyc})/60 = 404.1.$$

However, if we assume $\sigma_{dyc}^2 = 0$,

$$\sigma_{yc}^2 = (V_{yc} - V_{myc})/60 = 403.8.$$

Similarly

$$\sigma_y^2 = (V_y - V_{dy} - V_{yc} + V_{dyc})/120 = 840.7.$$

If we assume $\sigma_{dyc}^2 = 0$,

$$\sigma_y^2 = (V_y - V_{dy} - V_{yc} + V_{dmyc})/120 = 840.9.$$

As will be seen later from the standard errors of these estimates, the difference between the two results is negligible. No general statement can be made as to the better procedure; however, if the two results are decidedly different, the experimenter is advised to check his least squares model to see if aberrations from it, such as the existence of serial correlation, could have produced the discrepancies.

TABLE 5

ESTIMATES OF VARIANCE COMPONENTS AND STANDARD ERRORS
OF THESE ESTIMATES FOR THE LOG PRICE RATIOS

                         Mean Squares                Variance Components
Effect     D.F.        V              s²(V)*         Estimate      S.E.**
D             4       2,348           1,812,600         11.7        11.5
M            11      38,435          73,680,000        656.0†      176.
DM           44         583              16,118        −18.0        15.1
Y             4     129,513       3,425,930,000        840.7       506.
DY           16         914              92,822          7.4        13.1
MY           44      15,949           5,332,940       1173.0†      242.5
DMY         176         755               6,405        346.4†       40.1
C             1      37,810       3,232,600,000         28.7       197.
DC            4        69.0               1,587          0.3         0.8
MC           11       4,943           3,759,000         56.3        82.9
DMC          44        70.4                 215          1.7         3.2
YC            4      27,753         256,743,000        404.1       267
DYC          16        43.2                 207         −1.6         1.3
MYC          44       3,526             540,550        692.8†      147
DMYC        176        62.1                  43         62.1

* s²(V) = estimated variance of V.
** S.E. = Standard Error.
† Significance probability <.001.
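
The component estimates discussed above, including the two alternative treatments of the negative DYC estimate, can be reproduced with a few lines of arithmetic. This is a minimal sketch using the mean squares from Table 5; the dictionary keys and variable names are illustrative only, and the divisors 5, 12, 60 and 120 are those appearing in the worked equations.

```python
# Mean squares for the log price ratios (Table 5)
V = {"y": 129513.0, "dy": 914.0, "yc": 27753.0,
     "dyc": 43.2, "myc": 3526.0, "dmyc": 62.1}

sigma2_myc = (V["myc"] - V["dmyc"]) / 5
sigma2_dyc = (V["dyc"] - V["dmyc"]) / 12          # negative estimate

# Using the DYC mean square as it stands
sigma2_yc = (V["yc"] - V["myc"] - V["dyc"] + V["dmyc"]) / 60
sigma2_y = (V["y"] - V["dy"] - V["yc"] + V["dyc"]) / 120

# Assuming the DYC component is zero
sigma2_yc_alt = (V["yc"] - V["myc"]) / 60
sigma2_y_alt = (V["y"] - V["dy"] - V["yc"] + V["dmyc"]) / 120

print(round(sigma2_myc, 1), round(sigma2_dyc, 2))
print(round(sigma2_yc, 1), round(sigma2_yc_alt, 1))
print(round(sigma2_y, 1), round(sigma2_y_alt, 1))
```

The two versions of each estimate differ by well under one unit, which is small beside the standard errors in Table 5.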

The estimates of the variance components are computed by adding and subtracting one or more of the independent mean squares given in the analysis of variance table and then dividing by some constant. In general, we may indicate the result as follows:

$$\sigma_c^2 = \sum a_i V_i / k_c,$$

where $\sigma_c^2$ is the variance component, $k_c$ the divisor for $\sigma_c^2$, $a_i = \pm 1$, and $V_i$ = some mean square. For example,

$$120\,\sigma_y^2 = V_y - V_{dy} - V_{yc} + V_{dyc}.$$

Since the $V_i$ are independent mean squares, the variance of the estimate of $k_c\sigma_c^2$ is given by

$$\sum \text{Variance}\,(V_i).$$

If $V_i$ consists of only random components, $V_i = \chi^2\sigma_i^2/f_i$, and

$$\sigma^2(V_i) = E(V_i - \sigma_i^2)^2 = EV_i^2 - \sigma_i^4 = 2\sigma_i^4/f_i,$$

where $\sigma_i^2$ is estimated by $V_i$. Hence

$$\sigma_i^4 \doteq \frac{f_i V_i^2}{f_i + 2} \quad \text{and} \quad \sigma^2(V_i) \doteq \frac{2V_i^2}{f_i + 2}.$$
If $V_i$ consists of both random and fixed components, the problem becomes much more complicated. In this case the mean square, $V_i$, is an estimate of

$$\sigma_i^2 = \sigma_r^2 + k_s\sigma_s^2,$$

where $\sigma_r^2$ is the random component and $\sigma_s^2$ the fixed or systematic component, and $k_s$ is the coefficient of $\sigma_s^2$ in the analysis of variance. H. E. Daniels [2] has shown that the variance of $V_i$ is given by

$$\frac{2\sigma_r^4}{f_i} + \frac{4k_s}{f_i}\sigma_r^2\sigma_s^2 = \frac{4\sigma_r^2}{f_i}\left(\sigma_r^2 + k_s\sigma_s^2\right) - \frac{2}{f_i}\sigma_r^4.$$

$\sigma_r^2$ can be thought of as the sum and difference of several independent random variances, each estimated by a single mean square in the analysis of variance:

$$\sigma_r^2 = \sum_j a_j\sigma_j^2,$$

where $a_j = \pm 1$ and $\sigma_j^2$ is estimated by $V_j$.

Let $V_i = V_r + V_s$, where $\sigma_r^2$ is estimated by $V_r$ and $k_s\sigma_s^2$ by $V_s$. Since the $V$’s are all independent,

$$V_r = \sum_j a_j V_j$$

and

$$E(V_i V_r) = EV_i\,EV_r = \sigma_r^2\left(\sigma_r^2 + k_s\sigma_s^2\right).$$

Similarly

$$\sigma_r^4 = \sum_j a_j^2\sigma_j^4 + 2\sum\sum_{j<k} a_j a_k\sigma_j^2\sigma_k^2 \doteq \sum_j \frac{f_j}{f_j + 2}V_j^2 + 2\sum\sum_{j<k} a_j a_k V_j V_k,$$

since

$$EV_j^2 = \sigma_j^4\,\frac{f_j + 2}{f_j} \quad \text{and} \quad EV_jV_k = \sigma_j^2\sigma_k^2.$$


Collecting terms we find that an unbiased estimate of the variance of a mean square $V_i$ consisting of both random and fixed components is

$$\frac{4}{f_i}V_iV_r - \frac{2}{f_i}\left[\sum_j \frac{f_j}{f_j + 2}V_j^2 + 2\sum\sum_{j<k} a_j a_k V_j V_k\right],$$

where

$$V_r = \sum_j a_j V_j.$$
The variance of each mean square is given in Table 5. In order to 
illustrate the computations, consider the following examples: 


(i) MYC: This mean square consists of only random components. Hence the variance is

$$\frac{2(3526)^2}{46} = 540{,}551.$$

(ii) YC: Again the mean square consists of only random components and has a variance of

$$\frac{2(27{,}753)^2}{6} = 256{,}743{,}000.$$

(iii) Y: This mean square consists of both a random and a systematic component.

$$V_i = 129{,}513$$
$$V_r = 914 + 27{,}753 - 43 = 28{,}624$$
$$\sum_j \frac{f_j}{f_j + 2}V_j^2 = \frac{16}{18}(914)^2 + \frac{4}{6}(27{,}753)^2 + \frac{16}{18}(43)^2 = 742{,}000 + 513{,}486{,}000 + 2{,}000 = 514{,}230{,}000$$
$$\sum\sum_{j<k} a_j a_k V_j V_k = 914(27{,}710) - 27{,}753(43) = 24{,}134{,}000.$$

The variance of $V_i$ is given by

$$\frac{4}{4}(129{,}513)(28{,}624) - \frac{2}{4}(514{,}230{,}224 + 48{,}267{,}122) = 3{,}425{,}930{,}000.$$
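
The three worked variances can be recomputed directly from the formulas. The sketch below assumes the rounded mean squares used in the examples (for instance 43 for DYC); the function names `var_random_ms` and `var_mixed_ms` are illustrative and do not come from the paper.

```python
def var_random_ms(v, f):
    """Estimated variance of a mean square with only random components."""
    return 2 * v ** 2 / (f + 2)


def var_mixed_ms(v_i, f_i, parts):
    """Estimated variance of a mean square with random and fixed components.

    parts: list of (sign, mean_square, degrees_of_freedom) whose signed sum
    is the random part V_r of the mean square v_i (which has f_i d.f.).
    """
    v_r = sum(a * v for a, v, _ in parts)
    squares = sum(f * v ** 2 / (f + 2) for _, v, f in parts)
    cross = sum(parts[j][0] * parts[k][0] * parts[j][1] * parts[k][1]
                for j in range(len(parts)) for k in range(j + 1, len(parts)))
    sigma_r4 = squares + 2 * cross          # estimate of the squared random part
    return 4 * v_i * v_r / f_i - 2 * sigma_r4 / f_i


var_myc = var_random_ms(3526.0, 44)
var_yc = var_random_ms(27753.0, 4)
# Random part of the year mean square: V_dy + V_yc - V_dyc
var_y = var_mixed_ms(129513.0, 4, [(+1, 914.0, 16), (+1, 27753.0, 4), (-1, 43.0, 16)])

print(round(var_myc), round(var_yc), round(var_y))
```

The results agree with the worked examples to the rounding used there (about 540,551; 256,743,000; and 3,425,930,000).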
The variance of the estimate of $k_c\sigma_c^2$ is $\sum \text{Variance}\,(V_i)$.


The estimates of $\sigma_c^2$ and the standard errors of these estimates are given in the right side of Table 5. Again a few examples:

(i) $\sigma_{myc}^2 = 3464/5 = 692.8$, S.E. $= \frac{1}{5}\sqrt{540{,}551 + 43} = 147$

(ii) $\sigma_{yc}^2 = 24{,}246/60 = 404.1$, S.E. $= \frac{1}{60}\sqrt{257{,}284{,}000} = 267$

(iii) $\sigma_y^2 = 100{,}889/120 = 840.7$,

S.E. $= \frac{1}{120}\sqrt{3{,}425{,}930{,}000 + 93{,}000 + 256{,}743{,}000 + 0} = \frac{1}{120}\sqrt{3{,}682{,}766{,}000} = 505.7.$
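
These standard errors follow from summing the variances of the mean squares entering each estimate and dividing the square root by the component's divisor. The sketch below assumes the s²(V) values listed in Table 5; the helper name `component_se` is hypothetical.

```python
import math


def component_se(mean_square_variances, divisor):
    """Standard error of a variance component estimated from independent mean squares."""
    return math.sqrt(sum(mean_square_variances)) / divisor


# s^2(V) values taken from Table 5
se_myc = component_se([540551, 43], 5)                          # MYC, DMYC
se_yc = component_se([256743000, 540551, 207, 43], 60)          # YC, MYC, DYC, DMYC
se_y = component_se([3425930000, 92822, 256743000, 207], 120)   # Y, DY, YC, DYC

print(round(se_myc), round(se_yc), round(se_y, 1))
```

With these inputs the sketch gives about 147, 267 and 505.7, matching the examples.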


If a variance component is estimated from enough degrees of free-
dom, the estimate can be treated as a normal deviate with the standard
error as given in Table 5. However, most of the critical effects have too
few degrees of freedom for such a test. The test for month differences 
assuming a normal distribution gives a significance probability quite 
similar to the results given by the F-test. The test of significance for 
the YC interaction is quite different from that arrived at by the F-test. 
But we note that there are only 4 degrees of freedom for the YC mean 
square. 

If more weight classes were available, the component analysis for 
YC should give a good test of significance of this random component. 
The tests of the other random components give about the same signifi- 
cance probability with the F-test and with the variance components 
test. It is doubtful if the components analysis is very useful for testing 
the main effects, because of the limited number of degrees of freedom. 
However, the results in this case were in close agreement with those 
obtained above using the F-test.

The most important use for the standard errors of the variance components is in setting approximate confidence limits on the random variances. If we use ±2 S.E. as the 95% confidence belt for these random variances, we can see how much our composite null hypothesis might have erred in assuming certain of the random variances were 0.
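For example, $\sigma_{dyc}^2$ would not be expected to exceed 2, a negligible amount in testing the hypotheses that

$$\sigma_{yc}^2,\ \sigma_{dc}^2,\ \sigma_{dy}^2,\ \sigma_y^2,\ \sigma_c^2,\ \sigma_d^2 = 0.$$

Similarly the assumptions that $\sigma_{dmc}^2$ and $\sigma_{dm}^2 = 0$ could not be very serious in testing the hypotheses that $\sigma_{mc}^2, \sigma_{dc}^2, \sigma_c^2, \sigma_m^2, \sigma_d^2 = 0$.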

The random variances can be used in deciding the most efficient 
method of gathering future samples or of increasing the size of the 
sample. Since efficiency is inversely proportional to the variance, it is 
generally most efficient to sample where the variability is greatest un- 
less costs or other external factors prevent this. In our price data, it 
seems reasonable to recommend that future data should take in as 
many years as possible. It probably would not be necessary to use data 
for every month of the year or every day of the week. Apparently 
monthly averages would suffice. As an example of the use of the vari- 
ance components to estimate the variances of average price differences 
if more data were secured, consider the average log price ratio for a 
given month and weight class over n years and p days per week, giving 
np log price ratios for each average. The variance of this average is 
given by 


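$$\frac{\sigma_{yc}^2 + \sigma_{myc}^2}{n} + \frac{\sigma_{dc}^2 + \sigma_{dmc}^2}{p} + \frac{\sigma_{dy}^2 + \sigma_{dmy}^2 + \sigma_{dyc}^2 + \sigma_{dmyc}^2}{np}.$$

This is estimated by

$$V(n, p) = \frac{404.1 + 692.8}{n} + \frac{0.3 + 1.7}{p} + \frac{7.4 + 346.4 - 1.6 + 62.1}{np} = \frac{1096.9}{n} + \frac{2.0}{p} + \frac{414.3}{np}.$$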

Hence we conclude that this variance can be materially reduced only by increasing the number of years included in the data. Under the present plan of 5 days and 5 years, V = 219.4 + .4 + 16.6 = 236.4. If 1 day and 25 years were included, we would expect V to be about 43.9 + 2.0 + 16.6 = 62.5 with the same amount of data, assuming the variances associated with year effects did not increase with an increase in the number of years. This paper was not written to discuss data-gathering procedures, but it seemed pertinent to emphasize that the variance components are quite useful in deciding how and where to secure the data. Crump [1] has discussed this problem in detail.
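
The comparison of sampling plans can be tabulated for any number of years and days. This is a small sketch built on the component estimates from Table 5; the function name `average_variance` and its argument names are illustrative, and it assumes, as the text does, that the year components stay the same as more years are added.

```python
def average_variance(n_years, p_days):
    """Estimated variance of the average log price ratio for one month and class."""
    per_year = 404.1 + 692.8                 # YC and MYC components
    per_day = 0.3 + 1.7                      # DC and DMC components
    per_cell = 7.4 + 346.4 - 1.6 + 62.1      # DY, DMY, DYC and DMYC components
    return per_year / n_years + per_day / p_days + per_cell / (n_years * p_days)


present_plan = average_variance(5, 5)     # 5 years, 5 days per week
more_years = average_variance(25, 1)      # 25 years, 1 day per week
print(round(present_plan, 1), round(more_years, 1))
```

Both plans use 25 log price ratios per average, yet the second gives roughly a quarter of the variance, in line with the figures of about 236.4 and 62.5 quoted above.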
THE RELATION OF CONTROL CHARTS TO ANALYSIS
OF VARIANCE AND CHI-SQUARE TESTS


Henry Scheffé
University of California at Los Angeles


The following corrections should be made in the article published 
under this title in the Journal of the American Statistical Association, 
Volume 42, No. 239, September 1947. 

p. 427. Fourth line from top, replace é\/n by ¢/+/n. 
p. 427. Second line from bottom, replace S. by S;.

BOOK REVIEWS


Edited by
Oscar Krisen Buros
Rutgers University


Personnel and Training Problems Created by Recent Growth of Applied Statis- 
tics in the United States. Committee on Applied Mathematical Statistics of the 
National Research Council: Luther P. Eisenhart (Chairman), Samuel S. Wilks, 
Chester I. Bliss, Edward U. Condon, Harold O. Gulliksen, Lowell J. Reed, Charles 
F. Roos, Walter A. Shewhart, Hugh M. Smallwood, and Frederick F. Stephan.
Reprint and Circular Series, No. 128. Washington 25, D. C.: National Research
Council (2101 Constitution Ave.), May 1947. Pp. v, 17. Paper. $0.25. Two
reviews follow:


REVIEW BY GEORGE W. SNEDECOR 
Research Professor, Statistical Laboratory, Iowa State College


In this 17-page closely packed reprint, the Committee examines the present
shortage of competently trained statisticians. This dearth has its origin
in the rapid development, during the past ten years, of applied statistics,
notably in these fields: (a) industrial statistical control, (b) research in the
biological sciences, (c) collection and analyses of government statistics, (d)
market research and commercial sampling surveys, and (e) psychological
testing. Evidences of an acute condition are found in the rapid growth of 
statistical societies, and the demands for statistical personnel far in excess 
of the supply. 

The complicated part of the Committee’s recommendations is concen- 
trated under the heading, “Problems of Education and Training.” Here are 
considered the needs of three groups: (a) mathematical statisticians and 
(b) applied statisticians, the major interests of these two groups being sta- 
tistical work; and (c) research workers in other fields who not only use sta- 
tistics as a tool but act as statistical consultants to their colleagues. The 
Committee boldly takes the stand that “Students in all three groups require 
training beyond the elementary level in mathematics, in mathematical sta- 
tistics, in applied statistics and in some field in which statistics can be ap- 
plied.” The reviewer warmly commends this position; it should be considered 
not only by those who think mathematics is unnecessary to the statistician,
but even more particularly by those who prefer to keep mathematical 
statistics “pure,” unsullied by applications. 

Where may students find suitable curricula in statistics? Among 27 uni- 
versities polled by the Committee, ten claimed a graduate program leading 
to the Ph.D. degree in mathematical statistics. Of these, two have depart- 
ments of mathematical statistics, two have departments of statistics, and 
four have committees for coordinating advanced training in mathematical
and applied statistics. The situation in applied statistics is little better. In
only fourteen of the 27 institutions was an adequate training program indi- 
cated. One university alone had a sufficient array of stipends to attract 
graduate students. 

Little comfort can be found in the character of the courses offered or in the 
teaching personnel. “Introductory statistics is taught in many universities 
at upper class or graduate levels,” whereas the material is more suitable for 
freshman or even for secondary schools. Moreover, “These introductory 
courses are often taught by personnel for whom statistics is of only minor 
interest. Far too many have never studied statistics themselves beyond the 
level presented by the course.” All of this furnishes a dim prospect for us 
optimists who have been talking about raising the professional standards 
of statisticians. First, vested interests will have to be breached and appropri- 
ate curricula installed; next, new courses will have to be devised; then, 
competent instructors must be trained and retained in the face of competi- 
tion from government and industry; and finally, superior students must be 
enlisted during their early collegiate years. Even unto the third and fourth 
generation it seems likely that most of the statistical work of the country will 
be done by intuitionists with little or no formal training. 

The Committee, however, does not take a pessimistic view. On the con- 
structive side, there are many thoughtful suggestions, perhaps the most 
challenging of which are centered on the development and administration of 
a basic introductory course, preferably at the freshman level. This course is 
to be adapted not only to students expecting to major in statistics but to 
those who will need it in their later work in the natural and social sciences. 
The course is to provide the essential training on which can be erected the 
specialized courses in statistics following it in the several applied fields. 

It is a temptation to the reviewer to include more of the many quotable 
passages from this pamphlet. But its price is so small ($0.25) that maybe 
enough has been said to stimulate interested readers to send for a copy. 


REVIEW BY HOLBROOK WORKING


Economist and Professor of Prices and Statistics 
Food Research Institute, Stanford University 


“An outstanding problem at present,” says this report in a foreword, “is
the development of personnel for the teaching of statistics and for the
application of statistical methods in a wide variety of fields . . . . The situa-
tion deserves the earnest attention of college and university administrators
and individuals concerned with the teaching of statistics. This report is 
being sent to various societies, organizations and agencies concerned with 
the teaching and development of mathematical statistics and its applications 
with the expectation that steps will be taken toward the solution of the 
problems discussed by the Committee.”

A distinguished committee has undertaken here to crystallize the best
current thought on an educational problem of major importance. A reviewer
must be tempted to confine his remarks to emphasis on the high qualifica- 
tions of the committee and on the need for action on its recommendations; 
but in these pages even such a document as this deserves a critical review. 

The report undertakes at the outset to distinguish between “mathe- 
matical” and “applied” statisticians, and subdivides the latter category into 
“general” and “special-field” applied statisticians. What emerges, as often 
happens from attempts to define concepts which have “just growed,” is 
evidence that the terms would be better abandoned in serious discussion of 
training in statistics. Perhaps some day the term “statistician” will take on 
as specific (and as uncertain) a meaning as “doctor” has now. If so, statis- 
tician, in the sense of a person concerned with statistical theory and method 
as a full-time profession, must mean what we now try to designate by 
“mathematical statistician;” and it will go without saying that such a person 
is in an applied field, else he should be called a mathematician rather than a
statistician. For the distinctions which must be drawn, I like terms recently 
suggested by Carlos E. Dieulefait and Robert Guye. They recognized three 
categories of statistical knowledge and activities: statistical theory; statistical 
methodology and administration; and statistical analysis. Specialists in these 
lines might be called respectively: statistical theorists; statistical technicians 
or administrators; and statistical analysts. 

Many users of this report will value its seven pages on growth of interest 
in statistical methods, fields of employment, and the present demand for 
“statisticians,” but we may turn at once to its discussion of “problems of 
education and training.” The first section under that head, dealing with 
opportunities for advanced training, includes useful information on recog- 
nized requirements and on the scope of training facilities available in uni- 
versities and elsewhere. Further discussion of the topic would be helped, 
I think, by use of explicit terms in place of “applied statistics” and “applied 
statistician.” Those expressions are sometimes employed restrictively, im- 
plying deficiency in theory, and sometimes expansively, to include the art 
as well as the “science” of statistical analysis. Such ambiguity tends toward 
confusion. 

The main recommendations of the report are developed in the sections on 
“the teaching of elementary statistics” and “the statistical laboratory in 
teaching and research.” Here my critical sense seems to fail; I can only ap- 
plaud. Those who object to the proposal for a common elementary course to 
serve all (or nearly all) classes of students should reflect on how much the 
intelligent use of statistical methods depends on a body of basic statistical 
concepts, and consider how greatly the teaching of their specialized depart- 
mental courses would be facilitated if students came to them with some 
preliminary grasp of the concepts. To objections that the concepts can only 
be conveyed along with the methods, and by mathematical proofs, the 
answer is that teaching of the concepts through experimental demonstration
has been well tested, and found in practice a more effective and much less
painful method. 

The development of a good common elementary course in statistics will 
not come quickly, for there are many problems of content and method to be 
worked out. The experiment should be tried at a few institutions and watched 
closely by others. Some institution should try developing the course as a 
cooperative effort by instructors from several departments. It was such 
pooling of ideas that largely accounted for the effectiveness of the war-time 
intensive course in statistical quality control, which, like the introductory 
course proposed in this report, was much more concerned with ideas than 
with technical methods. 

When a good common introductory course in statistics at the college level 
has been developed, it will probably have to be changed very soon. The 
secondary schools will appropriate much of it and students will come to 
college already possessed of ideas that we labor long to convey now at a much 
later educational stage. 

On the topic of “organization of a statistics program within a university” 
I can resume my role of critic. There is a place in most universities, I be- 
lieve, for a university coordinating committee on statistical instruction. 
Courses dealing with statistical theory and methodology will probably con- 
tinue to be taught in several departments even at most institutions which 
have a special department of statistics, and few such institutions can well 
afford to have the courses uncoordinated. But the report is undoubtedly 
correct in implying that such coordinating committees have tended to be 
ineffective. If they are to work well, some provision must be made for giving 
more weight than is usual to considerations of general university interests, 
which often seem to conflict, at least temporarily, with the interests and 
ambitions of individual departments. Perhaps a statistical laboratory, 
charged with serving all departments in connection with instruction and re- 
search could exercise the necessary unifying influence on such a committee. 

Twelve specific recommendations in a final section of the report do much
more than summarize what has gone before; some crystallize implications 
of previous conclusions, and some, quite properly, express judgments of the 
committee on points which it has not undertaken to discuss. 


A Chapter in Population Sampling. Prepared by the Sampling Staff of the Bureau 
of the Census: Morris H. Hansen, William N. Hurwitz, W. Edwards Deming, 
Benjamin J. Tepping, and Harold Nisselson. (Bureau of Census, Washington, 
D. C.) Washington 25, D. C.: Government Printing Office, 1947. Pp. v, 141. 
$1.00. 
REVIEW BY S. LEE CRUMP
Assistant Professor, Statistical Laboratory, Iowa State College


The monograph reviewed here is a welcome addition to the somewhat
skimpy literature on the theory and practice of the design of samplings.
The decision, stated in the introduction, to publish the work “without
waiting for the incorporation of further research and application” seems to
this writer a commendable one.

Sample censuses to estimate the total population and characteristics of the 
population of several congested production areas in the U. S. in 1944 provide 
the illustrative material for the discussion. The book is divided into two 
parts. Part I describes the origin of the censuses and the desired precision 
of the results in the first few paragraphs, and then moves on to a detailed 
discussion of the choice of the sample design with regard to the requirements 
on the results and the available resources. The importance of balancing the- 
oretical advantages against practical limitations is emphasized throughout. 
This part of the discussion provides a good illustration of the kind of thinking 
that must go into the planning of a successful survey. 

In Part II, the theory involved in reaching the decisions and results de- 
scribed in Part I is developed systematically. This theoretical development 
is presented in a detailed step-by-step fashion and an alternative proof is 
given in one instance. Formulas for estimating the population and the 
variance of the estimated population are derived. The problem of minimizing 
the total cost subject to the desired precision in the estimate is thoroughly 
discussed. Finally the preparation of advance estimates of the variances 
involved and the determination of the sampling error of the final population 
estimates are discussed. Numerical illustrations from the censuses are pre- 
sented where they are most helpful. This reviewer was particularly impressed 
with the discussions of the validity of the assumptions made and the errors 
introduced by the approximations used. Such discussions are very helpful 
to the uninitiated but are frequently missing. 

Four appendices are given. The first gives excerpts from the instructions 
to the listers and supervisors engaged in taking the censuses. The second 
shows certain forms and schedules used in the censuses, while the last is an 
example of the published tables of the census results. The third appendix is 
a “memorandum on the formula for inflation” (inflation of the sample results 
to estimates of the total population). It appears that this material might 
more appropriately have been included in the body of the text. 
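
As a rough illustration of what "inflation" of sample results means, the sketch below assumes the simplest possible case: dwellings drawn with equal probability and without replacement from a complete list. The book's own design and formulas are not reproduced in this review, so the function names and the household counts are hypothetical.

```python
from statistics import variance


def inflate_total(sample_counts, n_units_in_area):
    """Expansion ("inflation") estimate of an area total.

    Assumes an equal-probability sample of units; this is a textbook
    simplification, not the design described in the book under review.
    """
    n = len(sample_counts)
    inflation_factor = n_units_in_area / n
    return inflation_factor * sum(sample_counts)


def estimated_variance_of_total(sample_counts, n_units_in_area):
    """Usual simple-random-sampling variance estimate of the inflated total."""
    n = len(sample_counts)
    big_n = n_units_in_area
    s2 = variance(sample_counts)  # sample variance with divisor n - 1
    return big_n ** 2 * (1 - n / big_n) * s2 / n


# Hypothetical data: persons counted in 6 sampled dwellings out of 600 listed.
counts = [3, 5, 4, 0, 6, 2]
estimated_population = inflate_total(counts, 600)
sampling_variance = estimated_variance_of_total(counts, 600)
```

The square root of the estimated variance gives a sampling error for the inflated total under these simplifying assumptions.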

In conclusion it may be remarked that this little book seems to be adapt- 
able for use as a text in part of a second-quarter course on the theory of the 
design of samplings. 


Des Mouvements Economiques Généraux. Léon H. Dupriez (Professeur à l’Université
de Louvain). Louvain, Belgium: Institut de Recherches économiques et
Sociales, de l’Université de Louvain (2 rue des Doyens), 1947. Two volumes. Pp.
xi, 552; 648. Paper. 
REVIEW BY SIMON KUZNETS
Professor of Economics and Social Statistics, University of Pennsylvania 


The movements referred to in the title comprise secular trends, long cycles
of a duration of roughly 50 years, and business cycles. The discussion is 
mostly in verbal terms. Chapters that can be classified as statistical (7 and 
8 in Vol. 1 and 13 and 14 in Vol. 2) account for less than one third of the
text. However, the contents of the volumes are enriched by appendices 
(about 125 pages) which provide specific references to sources; the series 
proper when calculated for use in the volumes or not published elsewhere; 
and the rates of change and other measures employed in the text. 

Statistical treatment is confined to secular movements and long cycles; 
the shorter term cycles are discussed almost without any quantitative data. 
The analysis of secular movements uses series on output (by major industry 
sectors and in a few individual industries), population, labor force and its
apportionment among industrial branches, and output per worker and per 
equipment unit—all for Great Britain, the United States, Belgium, and 
more briefly for France and Germany; supplemented by a few series for 
countries that became industrialized more recently and indices of world 
output of a few crude materials. The technique, applied largely to series on 
total production, is to fit either logistic curves (of the three constants type 
to logs of data) or simple exponentials. In the treatment of long cycles, the 
emphasis is on prices and monetary totals. For the few quantity series used
in this connection, the technique is that of comparing rates of percentage 
change between distinct periods, and interpreting changes in these rates as 
evidence of the existence of long cycles. 
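
The simpler of the two trend techniques mentioned, the simple exponential, amounts to a straight line fitted to the logarithms of the data. The sketch below is a minimal least-squares version under that reading; the function names are hypothetical and nothing here is taken from Professor Dupriez's own computations.

```python
import math


def fit_exponential(years, values):
    """Least-squares fit of log(value) = a + b * year; returns (a, b)."""
    n = len(years)
    logs = [math.log(v) for v in values]
    mean_x = sum(years) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, logs))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def annual_growth_rate(b):
    """Convert the slope of the log-linear fit into a yearly rate of change."""
    return math.exp(b) - 1.0


# Hypothetical output index growing at exactly 3 per cent per year.
years = [1900, 1910, 1920, 1930]
output = [100.0 * 1.03 ** (y - 1900) for y in years]
a, b = fit_exponential(years, output)
rate = annual_growth_rate(b)
```

A logistic curve would need an iterative fit and is not attempted here.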

According to the author’s preface, the volumes were initiated in 1940 and 
completed in 1944. Written under difficult war conditions, and without 
access to the economic and statistical literature that has appeared since 
1940, the volumes are impressive in their scope and broad conception 
of the theme. But the quantitative analysis leaves much to be desired. Even 
accepting the almost complete absence of it in the discussion of business 
cycles, one is still disturbed by the sparse data and the elementary level of 
analysis in the sections dealing with secular movements and long cycles. The 
observations below indicate a few major shortcomings. 

One difficulty in analyzing secular movements lies in securing comprehensive
and articulated measures of the changes in the total economic performance
of nations over a sufficiently long period. Unlike fluctuations involved
in business cycles, in which substantial synchronism may be assumed, secular 
movements differ widely in magnitude and often in direction among several 
sectors of a national economy. Indeed, to a large extent, growth of one sector 
is at the expense of another. In analyzing the general problem of economic 
change of the long run, it is, therefore, important to have at hand compre- 
hensive measures, with an adequate distinction of parts. Such measures are 
available for a number of industrial countries, in the form of national income 
series bearing on total production, and of national wealth series, bearing on 
total stocks of resources. Yet there is no attempt in Professor Dupriez’s 
book to assemble or analyze these measures. One is inclined to think that the 
yield of such an attempt might well have been rich; and that, without the 
frame of reference provided by these comprehensive measures, much of the 
specific treatment hangs in the air. 

In analyzing long cycles or in testing their existence or absence, the statistical
technique is of crucial importance; as is a theoretical base that
would firmly establish the mechanism that produces cyclicity over the span 
indicated. The use of simple rates of percentage change is particularly 
dangerous, unless the periods distinguished are free from imbalance of the 
shorter term cycles comprised within each. If they are affected by such 
imbalance, differences in rate of change between successive periods are no 
evidence of long cycles but may merely be reflections of the fact that in one 
period the imbalance of short cycles is of a different sign or magnitude 
from that in another period. Inspection of Professor Dupriez’s charts in 
Chapter 14 shows that his technique has not escaped this danger. With- 
out denying the fact that long swings (not necessarily periodic) exist in 
price levels and in the supply of money, the reviewer cannot accept Pro- 
fessor Dupriez’s discussion as having contributed to the establishment of 
either the mechanism that would produce roughly cyclical swings of that 
long duration or of the existence of such swings in real volume of economic 
activity. 
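
The danger described above can be seen in a small constructed case. The sketch assumes a hypothetical series with a constant 2 per cent trend and a regular 8-year cycle of 10 per cent amplitude; all of these figures are invented for illustration. Two successive 10-year periods then show quite different average rates of change even though no long cycle is present.

```python
import math


def series_value(t, trend_rate=0.02, cycle_years=8, amplitude=0.10):
    """Constant exponential trend multiplied by a short sinusoidal cycle."""
    trend = 100.0 * (1.0 + trend_rate) ** t
    cycle = 1.0 + amplitude * math.sin(2.0 * math.pi * t / cycle_years)
    return trend * cycle


def average_annual_rate(start, end):
    """Average compound rate of change between two years of the series."""
    ratio = series_value(end) / series_value(start)
    return ratio ** (1.0 / (end - start)) - 1.0


# The first period ends at a cyclical peak and the second begins there,
# so the short cycle is unbalanced in opposite directions in the two periods.
first_period = average_annual_rate(0, 10)
second_period = average_annual_rate(10, 20)
```

Here the first period runs above the 2 per cent trend and the second below it, purely because of where the period boundaries fall within the short cycle.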

Some other lacunae strike the eye. No reference is made to Paul H. Doug- 
las’ work on output and productivity—a serious omission in any analysis 
that tries to impute growth of output to the several productive factors. No 
mention is made of Colin Clark’s articles and books, although some of them 
appeared before 1940 and the compendium on Conditions of Economic Prog- 
ress was issued in that year. While most of the German publications in the 
field are utilized, that by Rolf Wagenführ on growth of industrial production
(by countries, on world scale) is neither noted, nor used. 

If one keeps in mind that the two volumes present only a partial survey 
and analysis of data and problems in the field, they can be used as a helpful 
summary of the statistical record for a few countries; and as a guide to some 
questions which even the limited past work in this field answers or suggests. 


Housing in the Cleveland Community: Past-Present-Future. Howard Whipple 
Green (Director, Real Property Inventory of Metropolitan Cleveland). Cleve- 
land 15, Ohio: Real Property Inventory of Metropolitan Cleveland, 1947. Pp. ii, 
25. Paper. $5.00. 


REVIEW BY S. MORRIS LIVINGSTON
Chief, National Economics Division, Department of Commerce 
Washington, D. C. 


This is a study of the housing of Cuyahoga County decade by decade since
1810. Its chief contribution is the development of a mortality curve based
on this experience of over a century. Projections to the year 2000 serve to
emphasize the increasing importance of replacement demand relative to further
growth in families. It should be of interest to anyone concerned with
the housing market as well as to those dealing particularly with the Cleveland
area.

Since the estimates for earlier years are based entirely on the census of
population they are necessarily a tissue of assumptions as to family size, vacancy
ratios and demolition rates. Potential errors in these assumptions, however,
probably do not affect the basic conclusions.

The estimates of the family units created in each decade which were still 
standing in 1940 involve a substantial smoothing of the curve. This is nec- 
essary because of the aberrations in the 1940 reports of dwelling units, by 
year built, to both the Cleveland Real Property Survey and the Census of 
Housing. This smoothing is legitimate if the purpose is to derive a mortality 
curve. It is not, however, a realistic basis for estimating actual demolitions, 
decade by decade, over the last century. 

The result, for example, is an estimate of demolitions in the thirties 50 per 
cent higher than in the twenties. Since many demolitions are to clear sites for 
other more intensive use, one would expect them to decline rather than in- 
crease with the decline in construction activity. Such estimates as there are 
for the country as a whole for the two decades tend to support this view. The 
reader may also question the conclusion, as the result of this smoothing, that 
all of the dwelling units created since 1900 are still standing. Some of them 
must have been destroyed by fire or demolished to make way for other struc- 
tures. 

While the mortality curve thus constructed may surprise some people in 
its implications of longevity, it can be argued that, in one sense, it actually 
understates the experience of the last century. Comparatively few dwelling 
units have been torn down simply because they are worn out. That it is the 
older dwellings which usually stand in the way of new commercial or multi- 
dwelling developments is largely coincidental. 

It is possible, however, that the report underestimates the importance of 
future replacement demand. The preoccupation with families and dwelling 
units as measures of housing demand and supply ignores the potentially im- 
portant shifts in demand by price class. In the thirties, for example, the 
population continued to grow but the income per family dropped sharply. 
The increase in effective demand was all concentrated at the lower end of 
the price scale. Hence the continued occupancy of the less desirable units 
and the limited volume of new construction. In sharp contrast is the experi- 
ence during and since the war. The excess of demand over supply, with the 
resulting sharp increase in prices, has been due less to the growth in families 
than to the increase in incomes. 

Any future increase in buying power, such as might result from lowered 
construction costs due to prefabrication, could create a demand for addi- 
tional housing disproportionate to any historical mortality curve. By making 
it possible for people to afford better housing it would reduce the demand for
the least desirable units and thereby force their elimination and replacement. 

In general the report does an excellent job of simplified presentation for 
the nontechnical reader. The almost total lack of footnotes indicating sources 
may be justified on this ground. Some of the headings of charts and tables,
however, are so ambiguous as to be difficult to understand. There is, for example,
the “Per Cent of Family Units Standing in Each Decade Demolished
During a 10-Year Period.” By comparing the data under this heading with 
those in an appendix table, this reader finally concluded that they were the 
units created in each decade which were still standing in 1930 and demolished 
during the 10-year period, 1930-1939. 
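
Under the reading this reviewer arrived at, the tabulated percentage could be computed as sketched below. The function name and all figures are hypothetical; the report's own data are not reproduced here.

```python
def percent_demolished(standing_at_start, demolished_in_period):
    """Per cent of each construction cohort, standing at the start of a
    10-year period, that was demolished during that period."""
    return {
        decade: 100.0 * demolished_in_period[decade] / standing_at_start[decade]
        for decade in standing_at_start
    }


# Hypothetical cohorts: units by decade built, standing in 1930,
# and the number of those demolished during 1930-1939.
standing_1930 = {"1880-1889": 4000, "1900-1909": 20000}
demolished_1930_39 = {"1880-1889": 400, "1900-1909": 0}
rates = percent_demolished(standing_1930, demolished_1930_39)
```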


The Population of the Soviet Union: History and Prospects. Frank Lorimer 
(Office of Population Research, Princeton University). League of Nations, Eco- 
nomic, Financial and Transit Department, Series of League of Nations Publica- 
tions, II, Economic and Financial, 1946, II, A, 3. New York: Columbia Univer- 
sity Press, 1946. Pp. xiv, 289. $4.00 (London W.C. 1: George Allen & Unwin, Ltd. 
[Ruskin House, 40 Museum St.]. Cloth, 17s. 6d.; paper, 15s.)


REVIEW BY P. K. WHELPTON
Associate Director, The Scripps Foundation for Research in 
Population Problems, Miami University 


IN VIEW of the important position in world affairs which the USSR is beginning
to occupy, it is highly desirable that more factual information
about its population be made available in English. Many articles and books 
have been written which present a traveler’s impressions gained during a 
short journey and strongly influenced by his personal biases. Lorimer’s book 
is the only one known to the reviewer which has been prepared by com- 
petent and impartial research workers and is based primarily on a careful 
examination of a vast amount of technical material prepared by Soviet 
statisticians and scientists. For this reason it should be extremely useful to 
persons who want authoritative information about the demography of the 
Soviet Union. 

An important part of the book relates to the size, composition (sex, age, 
ethnic group, occupation, etc.) and distribution of the population in 1897, 
1926 and 1939 (the years for which census data are available) and to the 
changes which took place during the intervening years. Considerable atten- 
tion is given also to an evaluation of fertility and mortality during certain 
years and among different groups (mostly geographic and ethnic). Migra- 
tion between areas and between the rural and urban portions of areas is 
another important topic. The final chapter relates to the factors affecting 
fertility and mortality trends, the changes in the size and composition of the 
population which will occur in the future if fertility and mortality follow 
certain specified trends (with an allowance for the effects of the war), and 
the implications of these changes. The discussion of demographic matters is 
given an appropriate background by an early chapter dealing with the 
economic structure of the Russian Empire, and sections in later chapters 
relating to natural resources, industrial production, the reorganization of 
agriculture, and similar topics. 

Because of changes in boundaries, in the wording of questions, and in 
definitions of terms from census to census, much of the data from different
censuses could not be compared directly, but only after they were adjusted in 
various ways. Similarly, because vital statistics were available for most of 
the nation during only a few years and for parts of it during some (but not 
all) of the remaining years, it was necessary to make various estimates in 
order to describe the events during the periods under consideration. The
basis of estimating is described carefully in the appendix; the fact that most
of the estimates may give only rough approximations is emphasized repeatedly
in the text. The estimates of the losses due to World War I, the
Revolution, and the postwar famine, and the estimates of natural increase
year by year from 1929 to 1939, are of special interest.

A competent reviewer should be familiar with the material which ought to 
be examined in preparing a book of this type, and able to express a worth- 
while judgment on the extent to which it has been examined and the accu- 
racy with which it has been digested. The present reviewer does not meet 
these requirements. He is impressed, however, by the number of titles (512) 
in the bibliography and the extent to which they have been used in the text. 
To do more he would need the assistance of someone as capable as Dr. 
Gordon, who worked on the material in the Russian language for Dr. Lorimer.

At certain points the situation in the Soviet Union is compared with that in 
the United States. Demographers will be able to make such comparisons 
without too much trouble at other places where they would be helpful. In 
this reviewer’s opinion, it would have been desirable from the standpoint of 
other American readers if additional comparisons of a similar nature had 
been included. 

The book has many maps and figures, which are excellent. It is unusually 
free from minor errors of the type which this reviewer notices on careful 
reading. 

The author and his colleagues, Princeton University’s Office of Population
Research, the League of Nations, the Milbank Memorial Fund, and the 
Carnegie Corporation are to be congratulated for the part they played in 
making possible the publication of this book. 


An Introduction to Business Statistics, Second Edition. John R. Stockton (Pro- 
fessor of Business Statistics, The University of Texas). Boston 16, Mass.: D. C. 
Heath & Co. (285 Columbus Ave.), 1947. Pp. ix, 478. $4.00. 


REVIEW BY ANTHONY J. NESTI
Chief Statistician, National Electrical Manufacturers Association 
155 East 44th St., New York City 


IT WAS indeed an unusual experience for the reviewer to critically appraise
Professor Stockton’s second edition of An Introduction to Business Statistics
from the standpoint of actual experience in the field of business statistics
after having plodded through the first edition of the book as a student
some years ago. It must be confessed that more mature judgment extracted 
more benefits from the review than was the case in the first reading of this 
work. 

Professor Stockton has attempted to meet a great need in directing his 
book to prospective executives rather than to prospective statisticians. It 
would have been well to indicate in the preface that the book could serve, 
not only the student who is aspiring to be a business executive, but also
the man who is already a business executive but who wishes to attain a 
working knowledge of the simpler statistical methods and analyses and their 
applications. 

There is a sad lack of understanding of statistics among the great bulk of 
executives in the business world today. There is, therefore, need for a book 
such as this which is intended to “emphasize the practical value of statistical 
analysis and the use that the businessman can make of statistical methods.” 
The words “practical value” are very important as a key to the reader’s 
interest and hence this review has been based entirely upon how well the 
author has brought out the “practical” value of statistical analysis and the 
“practical” use that the businessman can make of statistical methods. 

On the whole, the author has done a splendid job of first, assembling those 
statistical subjects which are of most interest to the businessman, next, of 
treating these subjects in very logical sequence, and finally, in using, in the 
treatment, language which makes the subject matter easy to follow and easy 
to understand. The first thirteen chapters represent a fine handling of such 
subjects as the need for statistics in business, statistical tables, internal 
records, external data, charts, averages, time series, seasonal variation, 
cyclical fluctuations, etc. It is only in the last four chapters on index num- 
bers, business barometers and correlation, that some improvement in the 
treatment of the subject might have been desirable. This will be brought 
out in the detailed criticisms which will follow. 

In Appendix A, the author has supplied an excellent list of problems which
are really representative of the problems actually encountered in business. 
It is very strongly recommended that the student of the subject spend con- 
siderable time in working out all of the problems, making a permanent 
record of the solutions so that they may be used as reference material in 
later years when the same or similar problems are encountered in actual busi- 
ness. One observation which might be made in connection with the prob- 
lems, is that Problem 4 on Machine Tabulation expects too much from the 
student on the basis of the material presented on the subject in this particu- 
lar volume. 

Other appendices provide for ready reference and use statistical formulas, 
glossary of symbols, tables of squares, square roots and reciprocals and
tables of 5-place logarithms. All of this material is invaluable to the student 
and saves a great deal of time in the study of the text. 

The comments which follow apply to each of the specific subjects covered 
by the text, and are intended to center attention on the more practical
applications of the subject to actual business problems.

Statistics in Business: The author has made several important statements 
which might be emphasized by the instructor who will transmit the subject 
to the student. These are that “the limitations of statistical methods as well 
as their value” should be recognized and “the ability to organize data into a 
compact logical table must be acquired by anyone who deals with statistical 
data.” 

However, in developing the need for statistical methods and analysis in 
present-day business, the author might have made a stronger case by draw- 
ing much more heavily on the experience of business during World War II 
and by referring to: (a) the highly competitive nature of modern business; 
(b) the growing importance of labor problems and the need for statistical in- 
formation in union negotiations; (c) the increasing tendency for diversifica- 
tion in products produced by a single company; and (d) the established im- 
portance of statistics in all national emergencies. 

In dealing with the subject of recording data, the real purpose should be 
mentioned. The proper recording of statistical information is not only necessary
“so that it will be readily available when needed,” but, more important,
it is necessary so that it will be of maximum value in meeting problems
which arise. 

In the use of external data, the student should be taught immediately to 
scrutinize very carefully such data as are already available in order to determine
their limitations and applicability to the problem in hand. Further, a
great service would have been performed if the author, in discussing external
data, had brought out the responsibility which each business, or each
industry, has in helping to improve the so-called “external data.” Many of 
such data are directly dependent upon business for clarity and completeness. 

Statistical Tables: From the reviewer’s point of view, the chapter on Sta- 
tistical Tables is the most important, and incidentally the best-written, 
chapter in the book for the business executive or for the prospective business 
executive—for that matter, even for the statistician. The importance, in any 
business, of properly tabulating statistical information, either for the records 
or for immediate analysis cannot be emphasized too greatly. The author has 
done an excellent job of covering situations which actually occur in everyday 
business, particularly in his treatment of construction of tables. One important
point which is not covered, however, is the importance of indicating
on the table the limitations of the data such as product definition, geographi- 
cal coverage, industry coverage, etc. Another comment which might be 
made concerns the statement (p. 33), “For geographical classification, there 
is no such natural order of arrangement as there is for time series and fre- 
quency distributions.” It would seem that a classification of national data by 
“regions” (e.g., northeastern, east central, south, etc.), by “states,” by 
“counties,” or by “trading areas,” might be termed “Natural Classifications.” 

Internal Statistics—Records and Reports: In discussing the problem of collecting
internal data, the author states that (pp. 42-43) “by far the most
satisfactory method of collecting internal data is to know what information is 
wanted and make the necessary records as the facts become known.” 

It is more important, actually, to anticipate the kind of information which 
may be required in the future and to collect and tabulate such information 
in advance of the need. This, admittedly, is a very difficult task yet it is one 
of the important and necessary functions of the executive. In developing this 
point, some examples might be included in order to provide assistance on 
methods of determining the type of information which might be needed. 

Published Data: It might be recommended to the student that he accept 
the following statements with a “grain of salt”: “no matter what problem the 
businessman has under consideration, it is likely that information of im- 
portance in its solution is available from some source” (p. 56). (This is too 
often not the case.) Also: “the collection of data of general interest to the 
public can, in many circumstances, be better done by government than by 
private organizations” (p. 69). (One should be careful in accepting the infer- 
ence here, for it is also true that, in many instances, the collection of such 
data can best be done by private agencies and not by government.) 

Collection and Tabulation of External Data: In treating with this discussion 
of collection of external data, the author has of necessity gotten into the 
subject of sampling. Sampling techniques represent a very large and special 
field of study for the advanced statistician and can hardly be covered in part 
of one chapter of one book. Hence, it is not surprising that one leaves this 
part of Professor Stockton’s book with a feeling of uncertainty regarding 
sampling. One thought which is generated is that the results of a survey 
conducted on a sample basis are not dependable unless the survey has been 
conducted on a personal-interview basis to be sure that the “right” proportion
of various strata are represented, or, in other words, that the sample
has been “controlled” properly. The treatment has placed too much empha- 
sis on controlling the sample. This is dangerous because it leads to the con- 
trolling of the results. 

Statistical Analysis: On pages 125 and 126 the author discusses the subject 
of “forcing percentages to total 100” when a number of items are expressed 
as percentages of the total. Several alternatives are presented, namely: 
(a) The total may be written 99.99 representing the actual total of the per- 
centages. (b) The total may be written 100.00 even though the column ac- 
tually totals, for example, 99.99 or 100.02. (c) The figures may be forced to 
total 100.00 by increasing or decreasing one or two percentages to bring the 
sum to exactly 100.00. 

The presentation and explanation of all three methods creates a very confusing
problem for the student. In fact, this is a prime example of how textbooks
sometimes make their subject matter unnecessarily difficult to understand.
In practical business there is no question about the total being expressed
in terms of 100.00 and no question about making certain that
the individual percentages do total exactly 100.00, with the choice being
whether it is desirable to extend the number of decimal places to one, two 
or more. Accordingly, since the book is intended for business executives, 
only method c above need be recommended and explained. 
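
Method (c) leaves open which percentages are to be adjusted. One common convention, assumed here and not taken from the text under review, is to give the leftover hundredths to the items with the largest discarded fractions (the "largest remainder" rule). The function name is hypothetical and the values are assumed to be non-negative.

```python
def forced_percentages(values, decimals=2):
    """Percentages of the total, adjusted so that they sum to exactly 100.

    Works in whole units of the last decimal place, truncates each share,
    then hands the leftover units to the largest discarded fractions.
    """
    scale = 10 ** decimals
    total = sum(values)
    target = 100 * scale
    exact = [v * target / total for v in values]
    units = [int(e) for e in exact]  # truncation of non-negative shares
    shortfall = target - sum(units)
    by_remainder = sorted(
        range(len(values)), key=lambda i: exact[i] - units[i], reverse=True
    )
    for i in by_remainder[:shortfall]:
        units[i] += 1
    return [u / scale for u in units]


# Three equal items: ordinary rounding gives 33.33 three times (total 99.99).
shares = forced_percentages([1, 1, 1])
```

With equal remainders the adjustment falls on the first item, so the three shares come out as 33.34, 33.33 and 33.33.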

Charts: Since charts play such an important part in the presentation and 
analysis of business statistics, it would be well for the student to discount 
immediately such charts as those of the two and three dimension type 
(Charts 6, 7, 9 and 10). Such charts are not at all clear and when used tend to 
raise unnecessary criticisms of the arguments presented. Further, the chart 
employing large dots to illustrate the geographic distribution is also con- 
fusing (Chart 19) and should not be employed. It would be preferable to use 
a variation in cross-hatching, a variation in colors or a variation in shades of 
the same color. The latter method, incidentally, is an alternative for cross- 
hatching in meeting the problem mentioned by the author on page 153. 

In presenting bar charts, the author has suggested horizontal bars for 
series other than time series indicating that vertical bars are always used in 
charts of time series. From the standpoint of clear presentation of comparisons,
one fails to understand the distinction made by the author. In almost
all cases, vertical bars are preferable. Further, the bar chart, either
the single bar or a series of bars and for that matter also the pie chart, would 
be much more effective if cross-hatching or shading were employed. 

Averages: The author has again done an excellent job in providing a clear 
description of two of the most used averages in business, namely, the arith- 
metic average and the median. The chapter could very well have omitted 
such items as the short method of computing the arithmetic mean (pp. 162- 
163) and the graphic interpolation of the mode. These methods are so 
seldom used in actual practice—at least in business statistics—that their 
elimination from the discussion of averages would do no harm but would 
rather simplify the understanding of the balance of the subject matter. 

The author makes an interesting observation when he states (p. 160) that 
“the best known of the averages is the arithmetic mean, usually referred to 
by public as the ‘average.’” This statement is, unfortunately, too true. In 
its truth lies a paradox in that it is difficult to sell business or the public in 
general the idea of employing or accepting any other type of average even 
though in many, many instances the arithmetic average is not the best aver- 
age. Too often the “median,” even though a better average, meets with 
strong opposition to its acceptance in lieu of the arithmetic mean. 
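
A small constructed case shows why the median is often the better average. The salary figures below are hypothetical; one large value pulls the arithmetic mean well above what most of the items look like, while the median is unaffected.

```python
from statistics import mean, median

# Hypothetical annual salaries in a small office, one of them an executive's.
salaries = [1800, 2000, 2100, 2200, 2400, 2500, 15000]

arithmetic_mean = mean(salaries)    # pulled upward by the single large salary
middle_value = median(salaries)     # the fourth of the seven ordered values
```

Six of the seven salaries lie below the arithmetic mean, which is the situation in which the median describes the typical case more faithfully.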

Dispersion: The only observation which the reviewer has to make in con- 
nection with this subject is that, perhaps, for most business uses, it would 
have been sufficient to limit the treatment of dispersion to about the first 
half of the chapter, that is, to standard deviation. The balance might be 
left to other books which interested students might study in more advanced 
courses in statistics. 

Time Series: The author has made a very lucid presentation of the 
subject of time series. Pages 222 and 223 on comparison of time series are of
particular interest to the businessman and it would be wise for the student 
to study them carefully. 

Secular Trend: In illustrating the graphic method of determining the
trend, the author states (p. 237), “the trend line was drawn where it ap- 
pears it should go, the judgment of the individual drawing the line being the 
criterion used to locate it.” It might have been pointed out that such judg- 
ment should be assisted by the mechanical process of equalizing the areas 
above and below the trend line. In suggesting the method of moving aver- 
ages, it might be pointed out that the results simply provide a curve which 
decreases the fluctuations in the original curve and that what might be done 
next is to apply the graphic method to the five-year average curve in order 
to obtain a trend line. It is the reviewer’s opinion that the graphic method of 
determining trend is the most practical for, as the author states on page 258, 
“it is difficult to get away from the goodness of fit as observed on a graph as 
the basis for deciding on a trend line.” The methods of semi-averages or 
moving averages do not prove too satisfactory. For more difficult trend 
problems, the method of least squares is undoubtedly superior. 
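
Two of the mechanical methods named above can be sketched briefly. The example assumes annual data, an odd number of years in the moving average, and a straight-line trend; the function names and figures are hypothetical and are not drawn from the text under review.

```python
def moving_average(values, window=5):
    """Centered moving average; `window` is assumed to be odd."""
    half = window // 2
    return [
        sum(values[i - half:i + half + 1]) / window
        for i in range(half, len(values) - half)
    ]


def least_squares_line(values):
    """Straight-line trend y = intercept + slope * t, with t = 0, 1, 2, ..."""
    n = len(values)
    mean_t = (n - 1) / 2
    mean_y = sum(values) / n
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in enumerate(values))
    stt = sum((t - mean_t) ** 2 for t in range(n))
    slope = sxy / stt
    intercept = mean_y - slope * mean_t
    return intercept, slope


# Hypothetical annual sales for ten years.
sales = [10, 14, 12, 16, 18, 17, 21, 23, 22, 26]
smoothed = moving_average(sales)            # six values; two years lost at each end
intercept, slope = least_squares_line(sales)
```

The moving average only damps the fluctuations and loses years at both ends, which is the reviewer's point; the least-squares line yields an explicit trend equation.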

Seasonal Variation: This subject is very well presented. The student, 
however, might be cautioned more strongly not to relate all seasonal fluctua- 
tions to the effect of the seasons themselves whether they be spring, summer, 
autumn and winter or whether they be the individual months of the year. 
The seasonal pattern should also be related to certain economic factors which 
happen to occur during particular seasons. 

Cyclical Fluctuations in Time Series: The author seems to have found it
necessary to include in an introduction to this subject his theory as to the
causes for and remedies for cyclical fluctuations in business. Since the 
thoughts expressed are definitely economic in nature and since they are sub- 
ject to much controversy, would it not be better to simply point out that 
cyclical fluctuations do exist and then proceed to indicate how they may be 
measured? This procedure seems to be logical particularly since the author
makes clear in his final sentence (p. 333) “this present chapter has been 
devoted completely to measuring the cyclical fluctuations in business activity 
and offers no procedures by which those fluctuations can be predicted.”

Construction of Index Numbers: This chapter strikes the reviewer as being 
wholly inadequate for the business man’s use. Too much of the discussion 
concerns price fluctuations. The text does not even mention the most im- 
portant use of the index number idea in business, namely, to obtain and 
maintain a comparable trend of current data where data does not in itself 
remain comparable. Nor does the author treat with the wide use of index 
numbers in correlation work, nor with the ease of comparing, through the 
use of index numbers, unlike, but related, time series. More material might 
well have been presented on “chain” indexing, pointing out the upward or 
downward bias which might be created and also how such a bias can be cor- 
rected at regular intervals so that the “chain” idea need not be sacrificed. 
Further, the author might have developed, more than he did, the idea of us- 
ing a number of years as a base period and the necessity for changing the 
base period from time to time. 
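
The bias that chaining can introduce, and its periodic correction, may be illustrated with a small sketch. The prices and quantities below are hypothetical, the Laspeyres formula is chosen here only as a familiar example (the review does not name a formula), and `laspeyres` is a name invented for this illustration. Prices and quantities return in the last period to their starting values, so a direct comparison gives an index of 1, while the chained index drifts upward.

```python
def laspeyres(p_base, q_base, p_now):
    """Laspeyres price index of p_now relative to p_base, using base-period quantities."""
    cost_now = sum(p * q for p, q in zip(p_now, q_base))
    cost_base = sum(p * q for p, q in zip(p_base, q_base))
    return cost_now / cost_base


# Hypothetical prices and quantities for two goods over three periods.
# In period 2 both prices and quantities return to their period-0 values.
prices = [(1.0, 1.0), (2.0, 1.0), (1.0, 1.0)]
quantities = [(10.0, 10.0), (5.0, 10.0), (10.0, 10.0)]

# Chain index: multiply the successive period-to-period links.
chained = 1.0
for t in range(1, len(prices)):
    chained *= laspeyres(prices[t - 1], quantities[t - 1], prices[t])

# Direct index: compare period 2 with period 0 in one step.
direct = laspeyres(prices[0], quantities[0], prices[2])

# Factor that re-aligns the chain with the direct benchmark comparison.
correction = direct / chained

print(chained, direct, correction)
```

Applying such a correction factor at regular benchmark dates is one way of keeping the chain while removing the accumulated drift; other linking schemes are possible.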

Business Barometers: The author points out that “the purpose of this 
chapter is to outline certain basic series on general business conditions that
every businessman should know about” and that “it is not intended as a
complete list but rather as a selected group of particularly significant ones.” 
However, there are many series which are not mentioned and which are 
more important than those cited in the chapter and much more important 
to many branches of business. The chapter could either have been expanded 
to cover some of these additional important current series or some of the 
latter could well have been substituted for those given in the text. For 
example, no mention is made of regular releases distributed by the Bureau
of the Census on manufacturing, distribution, and servicing trades. No men- 
tion is made of the important releases made by the Construction Division 
of the Bureau of Foreign and Domestic Commerce. 

Correlation: The criticism which should be directed toward this chapter 
has to do with the impressions which are created by the example employed 
and the conclusions drawn therefrom. For example, the chart showing the 
relation of gasoline consumption by states to motor vehicle registration by 
states is a very good illustration of close correlation between two series. 
However, it would be obvious to any individual that such a correlation would 
exist between these two particular series. But the author concludes “indicates 
that there is a high degree,” etc., rather than concluding “indicates the ex- 
pected high degree,” etc. Further, in plotting the same correlation on a 
logarithm chart the author concludes that “the correlation is shown to be 
equally high when this type of chart is used.” For these two series, it is ex- 
pected that the correlation would be high no matter what type of chart is 
employed. 

Another important criticism is that the famous “line of average relation- 
ship” is introduced suddenly by the author, without warning, and without 
first explaining what a “line of average relationship” is or how it is drawn 
(p. 389). 

Measurement of Correlation: Again in this chapter a more complete expla- 
nation of the “line of average relationship” would have been desirable. In the 
case of the method for obtaining the formula for a particular “line of average 
relationship,” the student will need to accept the hypothesis of the value of 
“a” and “b” but it should be made clear that such hypothesis is developed 
further in advanced statistical work. 

The statement “estimates may logically be made from the regression equa- 
tion only within the limitation of the data on which the computation of the 
equation was based” is not strictly true. Some extrapolation beyond those limits
may be made, provided it is reasonable. In describing the coefficient of correlation, the
author indicates that “the nearer r is to 1.00 the higher the degree of correla- 
tion; the smaller r is, the less the correlation. When r equals zero, there is 
no correlation.” The first part of this statement is correct. However, one does 
not have to reach zero to know that there is no correlation. If the coefficient 
of correlation is less than .60 or .50, one would certainly conclude that in all 
practicality there is no correlation. 
For the purpose of this text it does not appear necessary to include a discussion
of “Alternative Methods of Computation.”

A final general observation which should be made has to do with the vari- 
ous charts and tables presented throughout the book. Many of these charts 
and tables fail to follow the very concepts that the author is trying to teach 
the student. For example, Chart 92 on “Indexes of Total Building Con- 
struction Awarded and Residential Building Construction Awarded,” indi- 
cates only the source of data and the source cited is the secondary source— 
the Federal Reserve Bulletin. More important, this chart should make it 
clear that the index curves are based upon estimated figures on construction 
prepared by the F. W. Dodge Corp. which means that the indexes have very 
specific limitations. 

Further, the charts and tables should be presented closer to the subject 
matter so that the student’s trend of thought is not disrupted by having to 
go back 100 pages or forward half a dozen pages for the chart or table in ques- 
tion. The situation can be improved by repeating some of the same charts 
and/or tables at appropriate places in the book. 

In conclusion, the reviewer wishes to acknowledge that it is much more 
simple to criticize a publication such as Professor Stockton’s book, than it is 
to originally prepare such a volume. It is hoped, however, that some of the 
foregoing comments may prove constructive for those who may be interested
in the actual application of An Introduction to Business Statistics to
everyday business problems. 


Multiple-Factor Analysis: A Development and Expansion of The Vectors of
Mind. L. L. Thurstone (Professor of Psychology, University of Chicago). Chicago
37, Ill.: University of Chicago Press (5750 Ellis Avenue), 1947. Pp. xix, 535. $7.50
(London N.W. 1: Cambridge University Press [Bentley House, Euston Rd.].
42s.). Two reviews follow:


REVIEW BY LOUIS GUTTMAN
Associate Professor of Sociology, Cornell University 


TWELVE years ago, Thurstone wrote The Vectors of Mind as a textbook in
which was presented his approach to multiple-factor analysis as a generalization
of the single-factor theory of Charles Spearman, the English
psychologist. The psychological problem that interested Spearman was the 
notion of general intelligence. He hypothesized that the intercorrelations be- 
tween scores obtained on various kinds of achievement tests could be ac- 
counted for by a single common factor, and the residual variances by a 
unique factor for each test. The algebra for analyzing data according to this 
hypothesis was developed by him, pivoting on tetrad differences. Research 
by several investigators found that the differences did not vanish according 
to Spearman’s single common factor theory. Various alternative hypotheses 
of multiple factors were then propounded, and that of Thurstone appears 
to be dominant at the present time, at least among American psychologists. 
Discarding the idea of general intelligence, the more informed investigators
of today seek to uncover different kinds of intelligence which may or may not 
be related to each other. 

The present volume is an extended and enlarged version of Thurstone’s 
earlier textbook. Almost all of the previous material has been retained, al- 
though in amplified form for the most part. Opening with a brief statement 
of some of the principal notions and theorems of matrix algebra, the factor 
problem and its algebraic formulation are stated in terms of common and
unique factors. The simple-structure concept, wherein each variable has
zero loadings on as many common factors as possible, is emphasized as leading 
to a meaningful solution, and the centroid technique is again presented as 
among the most convenient to use for the initial calculations. More attention 
is given this time to the problem of communalities, to principal axes, geo- 
metrical models, and the varieties of configurations that are possible for a 
battery of variables. 

Among the innovations in the present volume is an exposition of the group 
method of factoring and of the method of extended vectors, an investigation 
of factorial invariance and the effect of selection of people, and the concept 
of second-order factors. Again a brief treatment is allotted to the problem 
of estimation of individual scores on the factors. The aid of mathematical 
statisticians is urgently solicited for solving the virtually unexplored prob- 
lems of sampling error. The mathematical treatment throughout is with a 
minimum of rigor, although actual proofs can undoubtedly be supplied for 
the most part. 

While the factor problem originated in connection with psychological data, 
almost no psychology will be found discussed in the volume under review. 
The difference in titles between the first edition and its present successor 
seems to reflect a difference in emphasis that Thurstone wishes to make. 
He regards multiple-factor analysis as a general scientific tool which may
be appropriate in many different sciences apart from psychology. In any dis- 
cipline where the interrelationships between many variables are to be studied 
—especially where fundamental concepts are still being sought—it is sug- 
gested that an exploration by the techniques of multiple-factor analysis may 
show how to construct a meaningful frame of reference for the variables. 

Thurstone’s work already has had a tremendous influence in the field of 
psychological measurement. The trail he has blazed is being followed by 
many, and the present book will make it even easier to follow. If a student 
is given a table of the intercorrelations between a set of variables, he now has 
a clear and relatively simple guide to follow for a series of numerical com- 
putations. However, it seems that some of the basic methodological problems 
which existed a dozen years ago still remain. There are still polemics concern- 
ing rotation of axes and the naming of the common factors. There is also the 
problem of the relationship of factor analysis to the prediction of outside 
variables, which involves the unique factors as well as the common factors. 

There is no problem of rotation for Spearman’s theory; the rotational question
arises when two or more common factors are found. There is no algebraically
unique solution for reproducing correlation coefficients from two or
more common factors. The problem of external prediction does affect the 
single-factor pattern as well as the multiple-factor. It is intended in the 
remainder of this review to emphasize these problems and to present alter- 
native suggestions which may indicate that progress to a solution may lie in 
other directions from these common factor approaches. 

It is interesting to notice that Thurstone begins in a rather different way 
from Spearman. Spearman’s hypothesis was cast in terms of partial correlation,
and can be phrased as follows. Let $s_1, s_2, \dots, s_n$ be the $n$ statistical
variables whose intercorrelations are to be factored. His hypothesis was that
there existed another statistical variable, say $x$, such that $r_{s_j s_k \cdot x}$ vanishes for
all $j$ and $k$ ($j \neq k$). From this it follows that, if $a_j$ is the correlation between
the $j$th variable and the common factor $x$, then $r_{s_j s_k} = a_j a_k$ ($j \neq k$). It is this
last relationship that makes tetrads, or second-order minors of the matrix
of intercorrelations, vanish if the single factor hypothesis is correct. In this
case, the $a_j$ are uniquely determinable and there is but one solution to the
problem. 
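
The vanishing of the tetrads under the single-factor hypothesis is easily checked numerically. The following sketch assumes four hypothetical loadings (the values in `a` are invented for illustration, as is the helper `tetrad_differences`) and builds the correlations as products of loadings; every tetrad difference then vanishes, and a loading can be recovered from three of the correlations.

```python
from itertools import combinations

# Hypothetical correlations of four tests with the single common factor.
a = [0.9, 0.8, 0.7, 0.6]
n = len(a)

# Correlations implied by the single-factor hypothesis: r_jk = a_j * a_k (j != k).
r = [[a[j] * a[k] if j != k else 1.0 for k in range(n)] for j in range(n)]


def tetrad_differences(r, i, j, k, l):
    """The three tetrad differences among four distinct variables."""
    return (
        r[i][j] * r[k][l] - r[i][k] * r[j][l],
        r[i][j] * r[k][l] - r[i][l] * r[j][k],
        r[i][k] * r[j][l] - r[i][l] * r[j][k],
    )


tetrads = [
    t
    for quad in combinations(range(n), 4)
    for t in tetrad_differences(r, *quad)
]

# Under the hypothesis the loadings are determined by the correlations,
# e.g. a_1^2 = r_12 * r_13 / r_23.
a1_recovered = (r[0][1] * r[0][2] / r[1][2]) ** 0.5
```

With observed correlations the tetrads would vanish only up to sampling error, which is where the empirical difficulties with the single-factor theory arose.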

Investigations with psychological data have not found the single-factor 
hypothesis adequate for mental testing. Thurstone’s approach has been to 
consider a set of more than one common factor, say $x_1, x_2, \dots, x_r$. His
hypothesis is equivalent to the following. If all the common factors are held
constant, then the partial correlations between the original variables vanish.
That is, Thurstone’s problem is to find a set of variables $x_i$ such that
$r_{s_j s_k \cdot x_1 x_2 \cdots x_r}$ vanishes for $j \neq k$. Thurstone, however, does not state this
directly in terms of partial correlation, but in an equivalent manner. Either
way, one arrives at the result that, if the common factors are uncorrelated,
and if $a_{jl}$ is the correlation between the $j$th observed variable and the $l$th
common factor, then


$$r_{s_j s_k} = a_{j1} a_{k1} + a_{j2} a_{k2} + \cdots + a_{jr} a_{kr} \quad (j \neq k). \qquad (1)$$


The value on the right hand side for the case where $j = k$ is called the communality
of the $j$th test.

A difficulty with the multiple-factors lies in that the $a_{jl}$ are not uniquely
determinable. There are infinitely many sets of them which will satisfy the
conditions for vanishing partial correlations. Any orthogonal transformation
of the common factor scores will yield a new set of common
factors which will satisfy equation (1). To resolve this indeterminacy, Thurstone
imposes a certain criterion, namely, that certain of the $a_{jl}$ vanish in
order to form what he calls a simple structure. (He does not restrict the common
factors to be orthogonal for this; the right member of (1) becomes
modified for intercorrelated common factors.) Other investigators have tried
to impose other conditions on the $a_{jl}$.
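
The indeterminacy just described can be made concrete with a two-factor sketch. The loadings in `A` are hypothetical, and `reproduced` and `rotate` are names invented for this illustration. Rotating the factor axes through any angle changes every loading, yet the sums of products on the right of equation (1), including the communalities on the diagonal, are unchanged.

```python
import math

# Hypothetical loadings of four tests on two uncorrelated common factors.
A = [[0.8, 0.1], [0.7, 0.2], [0.2, 0.7], [0.1, 0.6]]


def reproduced(loadings):
    """Right member of equation (1) for every pair (j, k); the diagonal
    entries (j = k) are the communalities."""
    return [
        [sum(x * y for x, y in zip(row_j, row_k)) for row_k in loadings]
        for row_j in loadings
    ]


def rotate(loadings, theta):
    """Apply an orthogonal rotation through angle theta to the factor axes."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c * x + s * y, -s * x + c * y] for x, y in loadings]


# A second, equally admissible set of loadings.
B = rotate(A, math.radians(30))

before = reproduced(A)
after = reproduced(B)
max_gap = max(
    abs(before[j][k] - after[j][k]) for j in range(len(A)) for k in range(len(A))
)
```

Since the correlations cannot distinguish `A` from `B`, some further criterion, such as simple structure, has to be brought in from outside the algebra.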

It is this reviewer’s belief that the hypothesis of multiple-factors in the 
case where Spearman’s is rejected may often not be the best alternative. Instead
of proceeding to multiple-factor analysis, one might investigate other
possible single factor patterns. For example, consider the following alterna- 
tive to Spearman’s hypothesis. Instead of holding a common factor constant 
and specifying that the intercorrelations among the $s_j$ vanish, let us hypothesize
that there exists an order among the original variables such that the
following partial correlations vanish: $r_{s_j x \cdot s_k} = 0$ ($j > k$). This is also a single
factor hypothesis, and it leads to the following restrictions on the original
variables. If $a_j$ is again the correlation of $s_j$ with $x$, then this new single factor
hypothesis leads to the relationships:


$$r_{s_j s_k} = \frac{a_j}{a_k} \quad (j \geq k). \qquad (2)$$


Notice that this relationship holds for $j = k$, in which case both the left and
right members are unity.
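
The step from the vanishing partial correlations to relationship (2) is short and may be worth spelling out. The sketch below uses only the standard formula for a first-order partial correlation, writing $x$ for the common factor and $a_j$ for the correlation of $s_j$ with $x$; it is offered as a reader’s check, not as the reviewer’s own derivation.

```latex
r_{s_j x \cdot s_k}
  = \frac{r_{s_j x} - r_{s_j s_k}\, r_{s_k x}}
         {\sqrt{\bigl(1 - r_{s_j s_k}^{2}\bigr)\bigl(1 - r_{s_k x}^{2}\bigr)}} = 0
\;\Longrightarrow\;
a_j = r_{s_j s_k}\, a_k
\;\Longrightarrow\;
r_{s_j s_k} = \frac{a_j}{a_k} \qquad (j > k).
```

By the same formula, Spearman’s condition that the partial correlation of $s_j$ and $s_k$ vanish when $x$ is held constant gives $r_{s_j s_k} = a_j a_k$, so that the two single-factor hypotheses differ only in which variable is held constant.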

If multiple-factor techniques were used on a matrix of intercorrelations 
satisfying (2), they would ordinarily reveal many common factors and not 
the single factor which is actually present. Furthermore, the use of the con- 
cept of communalities would destroy a remarkable feature of the principal 
axes of the correlation matrix of the n observed variables which have this 
new single factor pattern. It can be shown that the matrix is in general 
non-singular and therefore has $n$ principal components. Furthermore, if $b_j$
is the correlation of the $j$th test with a particular principal component, then
the $b_j$ satisfy a second-order difference equation of the following type:


$$\Delta_j (f_j \Delta_j a_j b_j) + \lambda g_{j+1} b_{j+1} = 0, \qquad (3)$$


where the $f_j$ and $g_j$ are given non-negative functions and $\lambda$ is the reciprocal
of the latent root of the correlation matrix to which the principal axes belong.
It can further be shown that each of the $n$ sets of $b_j$ which satisfy the above
difference equation (the boundary conditions are omitted here for brevity)
are oscillatory functions of $j$. In the limit, as $n$ becomes infinite, the discrete
solutions for the above difference equation become continuous functions
which contain among them many classical orthogonal functions of mathematical
physics.

The general difference equation (3) was first discovered by this reviewer 
during his work on scale analysis of qualitative data, which is a special case 
of an alternative single factor theory in which restrictions are laid down not 
only on the intercorrelations between the variables but also on the individual 
scores to be obtained on the common factor. The reason that multiple-factor 
analysis as proposed by Thurstone and others will fail in this situation is that 
it is not equipped to handle the curvilinear relationships that result among 
correlation coefficients, even though only linear transformations of scores are 
involved. Also, the notion of communality is absent (except possibly for 
unreliability) because of the fact that equation (2) holds for $j = k$ as well as
for $j \neq k$, which is not true for either Spearman’s or Thurstone’s cases of
equation (1). 
There are other kinds of single-factor hypotheses which this reviewer has
in mind, leading to other kinds of conditions which are of interest. In these 
single-factor theories, problems of rotation and the like do not arise because 
there is an algebraically unique solution for each. 

To state the factor problem in terms of specifications on what kinds of 
partial correlations should vanish also has the advantage of immediately 
generalizing into non-metric approaches. Thurstone prophesies, in his pref- 
ace, that non-metric theories will come in the future. There are already in 
existence two non-metric approaches to factor analysis to which it may be 
worth while calling attention here. The first is that of scale analysis already 
mentioned, and the second is that of latent attributes being developed by 
Paul F. Lazarsfeld at Columbia University. In each of these approaches, the 
notion of correlation is replaced by the notion of complete statistical inde- 
pendence. Statistical independence, of course, does not depend upon the 
metrics used, and also the variables may be qualitative as well as quantita- 
tive. Lazarsfeld’s non-metric single factor theory is at present limited to 
dichotomies, though it can be extended to any number of categories per 
variable. It states that a latent dichotomy can be found such that, if it is 
held constant, then the n observed dichotomies will be statistically indepen- 
dent. This leads to a tetrad condition like Spearman’s, but further conditions 
must also be satisfied for the hypothesis to hold. In scale analysis, the ob- 
served items are arranged in a certain order, and the relationship with the 
scale variable is observed for each item in turn, holding the next items con- 
stant. (Previous writings on scale analysis provide a different formulation, 
but it is equivalent to that just given here.) Scale analysis proceeds accord- 
ing to the alternative single factor theory described above, but in non-metric 
terms, and the latent dichotomy is in the spirit of Spearman’s theory. 
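
How a latent dichotomy leads to a tetrad condition can be seen in a small exact computation. The sketch below is an illustration under invented numbers: the class proportion `w` and the response probabilities in `p` are hypothetical, and the helper names are this sketch’s own. The four observed dichotomies are made independent within each latent class; their observed covariances are then not zero, but they have the product form that makes tetrad differences vanish.

```python
from fractions import Fraction as F
from itertools import product

# Hypothetical latent dichotomy: class 1 has probability w, class 0 has 1 - w.
w = F(2, 5)
# Hypothetical probabilities of a positive response to each of four items,
# given the latent class: (P(item = 1 | class 0), P(item = 1 | class 1)).
p = [(F(1, 10), F(7, 10)), (F(1, 5), F(4, 5)), (F(3, 10), F(3, 5)), (F(1, 4), F(9, 10))]
n = len(p)


def joint(pattern):
    """P(observed pattern) when items are independent within each latent class."""
    total = F(0)
    for cls, weight in ((0, 1 - w), (1, w)):
        prob = weight
        for item, x in enumerate(pattern):
            prob *= p[item][cls] if x == 1 else 1 - p[item][cls]
        total += prob
    return total


patterns = list(product((0, 1), repeat=n))


def mean(j):
    """Observed proportion of positive responses to item j."""
    return sum(joint(x) for x in patterns if x[j] == 1)


def cov(j, k):
    """Observed covariance between items j and k."""
    both = sum(joint(x) for x in patterns if x[j] == 1 and x[k] == 1)
    return both - mean(j) * mean(k)


# Tetrad difference of the observed covariances for items 0, 1, 2, 3.
tetrad = cov(0, 1) * cov(2, 3) - cov(0, 2) * cov(1, 3)
```

The same tetrad vanishes for the correlations, since the standard deviations cancel. As the text notes, this is a necessary condition only; further conditions on the higher-order joint frequencies must also hold.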

In any theory of factor analysis, the notion of a universe of tests or ob- 
served variables must be paramount. Any n tests used in a study are but a 
sample of a universe (and also are not ordinarily obtained by a random sampling
process). It has been shown¹ that the number of tests must be indefinitely
large in order to estimate perfectly the individual scores on the common
factors and on the unique factors. Thurstone suggests that in practice
one would not want to use the multiple regression on all the tests in the 
battery, but rather try to devise some new tests which will measure the 
common factors more directly. While this may sometimes be possible, unfortunately
the implications of a universe of tests seem to have been overlooked
when Thurstone goes on to introduce the concept of second-order
factors. His multiple-factors are in general intercorrelated. Thurstone therefore
suggests that the $r \times r$ matrix of these common factor intercorrelations
be factored in turn, to yield second-order factors. From the theorem just referred
to, however, in order to obtain individual scores on such second-order
factors, it is necessary in general to have an infinitely large number of
first-order common factors. But the number of first-order factors is supposed
to be small, in fact, very small. Hence, introducing second-order factors
would seem to vitiate the basic idea of parsimony that underlies Thurstone’s
approach.

¹ Louis Guttman, “Multiple Rectilinear Prediction and the Resolution into Components,”
Psychometrika 5: 75-99, June 1940.

The communality of a test expresses but part of its variance, namely that 
associated with the common factors. From this arises another serious diffi- 
culty, with respect to prediction of external variables. In general, the com- 
mon factors of a battery will not predict as well as the original tests them- 
selves. If an external variable is related to any of the unique factors, then 
using only the common factors throws away predictive efficiency. It be- 
comes a real question, then, as to wherein the parsimony of factor analysis 
lies when it comes to external predictions. An interesting feature of the the- 
ory which leads to equation (2) is that the inverse of the correlation matrix 
is essentially expressed by the coefficients in (3) and hence has a particularly 
simple form. It is the inverse matrix that is involved in external predictions, 
and (3) gives some means of studying laws of formation of regression co- 
efficients. In particular, scales of qualitative data have important properties 
for external predictions that are discussed in the literature of that subject. 
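
The simple form of the inverse can be exhibited on a small example. The sketch below assumes five hypothetical loadings in `a` and builds a correlation matrix of pattern (2), in which each correlation is the ratio of the smaller loading to the larger; the exact inverse, computed with rational arithmetic by the helper `inverse` (a name invented here), turns out to be tridiagonal. That this is the simple form the reviewer has in mind is an inference from the second-order character of equation (3), not something the text states.

```python
from fractions import Fraction as F

# Hypothetical correlations of five ordered tests with the single factor,
# decreasing along the order so that every ratio a_j / a_k (j >= k) is <= 1.
a = [F(9, 10), F(4, 5), F(3, 5), F(1, 2), F(2, 5)]
n = len(a)

# Correlation matrix of pattern (2): r_jk = a_j / a_k for j >= k, symmetric.
R = [[a[max(j, k)] / a[min(j, k)] for k in range(n)] for j in range(n)]


def inverse(m):
    """Exact Gauss-Jordan inverse of a square matrix of Fractions."""
    size = len(m)
    aug = [list(row) + [F(int(i == j)) for j in range(size)] for i, row in enumerate(m)]
    for col in range(size):
        # Find a row with a non-zero entry in this column and bring it up.
        pivot = next(r for r in range(col, size) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        # Clear the column in every other row.
        for r in range(size):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [v - factor * u for v, u in zip(aug[r], aug[col])]
    return [row[size:] for row in aug]


R_inv = inverse(R)

# Entries more than one step off the diagonal: all exactly zero here.
off_band = [R_inv[j][k] for j in range(n) for k in range(n) if abs(j - k) > 1]
```

One consequence is that the regression of any one test on all the others involves only its immediate neighbours in the order, which gives some idea of why regression coefficients under this pattern follow a regular law.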

Multiple-Factor Analysis, then, does not lead into new realms, but broad- 
ens the footsteps of its predecessor. It will be a more easily taught book, 
and hence should be even more useful in introducing the student to Thur- 
stone’s approach. It would be unfortunate, however, if certain algebraic 
routines became a substitute for thinking. It would be a setback for psychol- 
ogy—or any other discipline—if students were so drilled that, on being con- 
fronted with a table of correlations, they would automatically “multiple 
factor” it. While this is certainly not Thurstone’s intention, many of his fol- 
lowers—as do followers of any leader—seem to deviate from the intended 
path and become enmeshed in arithmetic to the exclusion of basic thinking. 

Thurstone has contributed perhaps more than anyone else to the opening 
of the eyes of psychologists and other social scientists to the world of multi- 
variate analysis. It is a subtle world, and perhaps not tractable to any single 
approach for examining its structure in different situations. 


REVIEW BY D. N. LAWLEY
Lecturer in Statistics, University of Edinburgh


IN WRITING the present book the author’s original intention was, he tells us,
to revise a previous work of his, The Vectors of Mind. However, so many
alterations have been made and so much fresh material added that this 
must be regarded as a new book on the subject. It makes available the re- 
sults of work performed by Professor Thurstone and his colleagues at the 
University of Chicago during the last ten years. 

When a set of mental tests has been given to a group of individuals, a
factorial analysis can be performed on the results, as the author remarks, 
“for one of two purposes, namely, (1) to condense the test scores by expressing
them in terms of a relatively small number of linearly independent factors
or (2) to discover the underlying functional unities which operate to
produce the test performances and to describe the individual differences 
eventually in terms of these distinguishable functions.” It is between the 
supporters of these two alternatives that much controversy has arisen in the 
past. While some workers in the field are content to regard factors purely as 
statistical concepts, others, among them the author, are concerned to give 
them a psychological interpretation. Certainly if this second object can in 
fact be achieved and a set of universally acknowledged “primary abilities” 
established, the utility of factor analysis will be greatly increased. Through- 
out the book, therefore, the emphasis is on methods designed to discover such 
primary abilities or factors. 

One of the many ingenious illustrations of the author’s methods is his 
application of them to data which we are subsequently told have been ob- 
tained from measurements of such things as cylinders or boxes. In the case 
of the boxes, for example, it is shown that one set of factors derived from the 
analysis can be identified with the dimensions of the boxes. Furthermore, 
the relation between the measurements and the factors is such that what 
is termed “simple structure” results. Hence if it is permissible to regard
the abilities of the human mind as in some way analogous to the dimensions 
of a box, we should expect that these abilities would be revealed by the meth- 
ods described. In this case, for box measurements we must substitute test 
scores. 

Hitherto psychologists have tended to confine themselves almost entirely 
to the use of orthogonal factors, that is to say factors which are uncorrelated
in the population or sample under examination. Professor Thurstone,
however, does not restrict himself in this way but allows his factors to be
oblique, or correlated. This procedure may be justified by using once more
the box analogy. When a collection of boxes, such as might arise in practice, 
is examined, it is found that those boxes which are tall tend also to be thick 
and wide. In other words the three dimensions are correlated. In the same 
way we might reasonably expect to find correlation between the factors or
dimensions of the mind. 

Throughout the book constant use is made of matrix algebra, a knowledge 
of which has now come to be regarded as essential for a complete under- 
standing of factor analysis. Geometrical reasoning and diagrams are also 
much employed. For this reason a mathematical introduction is given, which 
should be of great assistance to those whose mathematical training has been 
somewhat limited. A list of references to other mathematical works is also 
provided. The argument is frequently illustrated with numerical examples, 
and high praise is due for the lucidity and clarity with which the sometimes 
highly complicated computations are explained. 

Though here and there references are made to sampling errors and prob- 
lems of significance testing, the author has avoided any discussion of these 
topics. From a statistical point of view this is a pity. He says “a fair appraisal
of the situation would probably acknowledge . . . that, while statistical
theory has made important advances in the last two decades, it has not
advanced far enough to be immediately useful for factor analysis as a scientific
method without still further advancement.” This statement is perhaps a
little unfair. Under certain conditions a method is now available for testing 
how many common factors can be regarded as significant; though the method 
cannot of course tell us anything regarding the nature of the factors ob- 
tained, it does indicate how many dimensions are worth including in the 
common factor space. The point is important since in many investigations in 
the past more factors have been extracted than are warranted by the size 
of the data. The result has been that psychological interpretations have been
forced upon factors which are entirely spurious and represent nothing but
sampling error. Since it appears that in general the sampling errors of estimated
factor loadings are rather large, one is led to wonder whether in dealing
with experimental data it may not be a matter of some difficulty to detect
simple structure even when it does in fact exist.

Two variables are (p. 63) defined to be statistically independent when the 
correlation between them is zero. This definition is however at variance with 
the proper meaning of statistical independence, which is by no means identi- 
cal with or a necessary consequence of zero correlation. 
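
A standard counter-example makes the distinction concrete. In the sketch below, which is an illustration and not drawn from the book, a hypothetical variable X takes the values -1, 0 and 1 with equal probability and Y is its square. The covariance, and hence the correlation, is exactly zero, although Y is completely determined by X.

```python
from fractions import Fraction as F

# Hypothetical example: X takes -1, 0, 1 with equal probability and Y = X**2.
support = [(-1, 1), (0, 0), (1, 1)]  # (x, y) pairs, each with probability 1/3
prob = F(1, 3)

mean_x = sum(prob * x for x, _ in support)
mean_y = sum(prob * y for _, y in support)
covariance = sum(prob * (x - mean_x) * (y - mean_y) for x, y in support)

# Independence would require P(X=0, Y=0) = P(X=0) * P(Y=0).
p_joint = sum(prob for x, y in support if x == 0 and y == 0)
p_x0 = sum(prob for x, _ in support if x == 0)
p_y0 = sum(prob for _, y in support if y == 0)

print(covariance)            # 0
print(p_joint, p_x0 * p_y0)  # 1/3 1/9
```

Zero correlation is thus implied by independence but does not imply it; the two coincide only under further assumptions, such as joint normality.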

In spite of these criticisms the book should certainly be read by all those 
who are interested in the subject of factor analysis. Even though they may 
consider some sections of it to be controversial, they are sure to find the argu- 
ments put forward stimulating and such as to merit close attention and 
thought. 


Sequential Analysis. Abraham Wald (Professor of Mathematical Statistics, Columbia
University). New York 16: John Wiley & Sons, Inc. (440 Fourth Ave.),
1947. Pp. xii, 212. $4.00. Two reviews follow:


REVIEW BY G. A. BARNARD
Department of Mathematics, Imperial College of Science 
and Technology, London 


PROFESSOR Wald, his publishers, and all those associated with the rapid
publication of this book deserve congratulations. At the beginning of
1943, the idea underlying sequential analysis had not yet been formulated. 
The end of that year saw the publication, in restricted form, of the basic 
results of Professor Wald’s fundamental researches, and their application to 
various problems of war production and development. In 1944 and 1945 the 
first scientific papers were being published on the new developments; in 
1946 a beginning was made towards a wider dissemination of the practical 
procedures. And now, in 1947, we have a book which gives an authoritative, 
up-to-date account of the whole theory. Such speed and enterprise in the 
spreading of new scientific developments is rare. 

Sequential analysis in a sense is the natural result of the coming of age of 
statistical methods in scientific investigation. Time was when the statistician 
was almost wholly ignored by experimental scientists. This stage gave way
to a position where the statistician was allowed to examine the results of ex- 
periments which had already been carried out, to advise on their “signifi- 
cance” or otherwise. The main advance at this time occurred in the field of 
agriculture, where masses of experimental data had been accumulated, but 
where the results were so much influenced by chance factors that the conclu- 
sions to be drawn lay buried in confusion. It was necessary then to draw up 
rules for sorting experimental results into three classes: (a) those fit for the 
waste-paper basket, (b) those which should be put on file, and (c) those 
which were worth following up immediately. The conceptions of 5 per cent 
significance for class (b), and 1 per cent significance for class (c) were ex- 
ceedingly valuable in this connection—so valuable, in fact, that they tended 
to live on and be applied in other situations where they were not appropri- 
ate. At this stage the function of the statistician was not so much to use 
data, as to throw it away—to get rid of the chaff, so that the wheat could be 
attended to. It needed someone with the personality and scientific quality of 
Professor R. A. Fisher to enable statistical methods successfully to go 
through this stage to the next—that in which statistical considerations were 
brought to bear on the advance planning of experiments. Finally, after their 
being brought in at the beginning and at the end, sequential analysis marks 
the entry of statistical considerations into the very process of experimenta- 
tion itself. It is only natural that such close collaboration between experi- 
menter and statistician should bring benefits to both. In particular, by the 
use of sequential methods, the experimenter’s task is made easier, in that 
he has fewer observations to take, and the statistician’s task is made simpler. 

The simplicity of sequential methods, as compared with classical ones, 
arises from the fact that, by doing away with inessential restrictions, and 
posing the questions correctly, sequential methods are able to penetrate more 
closely to the heart of the problem of testing statistical hypotheses. What, 
after all, is a simple statistical hypothesis? What does it do for us? It enables 
us to attach a number to experimental results—the likelihood of such re- 
sults, on the hypothesis in question. The connection between a simple 
statistical hypothesis H and observed results R is entirely given by the 
likelihood, or probability function L(R|H). If we make a comparison be- 
tween two hypotheses, H and H’, on the basis of observed results R, this can 
be done only by comparing the chances of getting R, if H were true, with 
those of getting R, if H’ were true. Mathematically, if L(R|H)=L, and 
L(R|H’)=L’, then our decision about H and H’, in the light of data R, 
must depend on the value of some function f(L, L’). Furthermore, this func- 
tion f must be a function of the ratio, L’/L, only. (Because, intuitively, we 
can imagine that in addition to observing R, we might have observed some 
irrelevant event, such as the fall of a coin, whose probability is p, independ- 
ent of R. Then the likelihoods on H and H’ would become pL and pL’, and 
since such an irrelevant observation could not affect our decision about H 
and H’, we must have f(pL, pL’)=f(L, L’).) If this ratio is large, L’ is much 
bigger than L, and we shall be inclined to believe in H’ rather than H, on 
the data given; if the ratio is small, it will be the other way about. While if 
the ratio is nearly 1, this means that the data R gives us little ground for 
preferring either of the two hypotheses. 
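
The argument that only the ratio L’/L can matter may be checked numerically. The following is a minimal sketch, assuming binomial data and two hypothetical success probabilities (0.5 under H, 0.7 under H’); the data values and the name `q` for the probability of the irrelevant event are illustrative assumptions, not taken from the book.

```python
from math import comb


def binomial_likelihood(k, n, p):
    """Probability of k successes in n independent trials with success chance p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Hypothetical results R: 14 successes in 20 trials.
# Assumed hypotheses: H says p = 0.5, H' says p = 0.7.
L = binomial_likelihood(14, 20, 0.5)
L_prime = binomial_likelihood(14, 20, 0.7)
ratio = L_prime / L

# An irrelevant event of probability q, independent of R, multiplies both
# likelihoods by q, so the ratio (and any decision based on it) is unchanged.
q = 0.5
ratio_with_coin = (q * L_prime) / (q * L)

print(f"L'/L = {ratio:.3f}; with irrelevant event: {ratio_with_coin:.3f}")
```

Here the ratio exceeds 1, so these hypothetical data lean towards H’, though not strongly.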

It is at this point that sequential analysis poses the question in a more 
natural manner than the classical theory of testing hypotheses. In the classi- 
cal approach, the question is put: Which of the two hypotheses, H or H’, 
should we adopt, on the basis of the data R? As if we were always compelled 
to choose one or other of these two alternatives. Sequential analysis, on the 
other hand, poses the question: Are the data R sufficient ground for adopting 
H, or for adopting H’, or are the data insufficient? In other words, we ask, is 
the likelihood ratio L’/L so large that we can safely accept H’, is it so small 
that we can safely accept H, or is it so near to 1 that we have no safe grounds 
for decision? A rule for answering this question will take the form of fixing 
two numbers, A > 1 and B < 1, and prescribing that we are to accept H’ if 
the likelihood ratio is greater than A, we are to accept H if the likelihood ra- 
tio is less than B, while we consider the data insufficient if the likelihood ratio 
lies between A and B. We may, for example, take A to be 1000, and B to 
be 1/100; what would be the consequences of such a choice? 
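
The three-way rule just described can be sketched in a few lines. The function name `decide` and the returned labels are illustrative assumptions; the default thresholds are the values A = 1000 and B = 1/100 quoted above.

```python
def decide(ratio, A=1000.0, B=0.01):
    """Apply the three-way rule to a likelihood ratio L'/L.

    Accept H' if the ratio exceeds A, accept H if it falls below B,
    and otherwise report that the data are insufficient.
    """
    if not (B < 1 < A):
        raise ValueError("the rule requires B < 1 < A")
    if ratio > A:
        return "accept H'"
    if ratio < B:
        return "accept H"
    return "data insufficient"


print(decide(2500.0))   # ratio above A
print(decide(0.004))    # ratio below B
print(decide(5.2))      # ratio between B and A
```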

As far as any one particular set of data are concerned, the consequences 
of a particular choice of A and B will, of course, be simply that we shall either 
arrive at a decision, or we shall not arrive at a decision; and our decision, if 
we make one, may be either right or wrong. This is all we can say about one 
particular case. We can only say more, if we regard the original data of our 
one particular case as one member of a real or imaginary set of similar cases; 
we shall then be able to make probability statements about the relative 
frequencies of the various types of consequence. It has become customary 
to refer to two such sets of “similar” cases. Both sets consist of imaginary 
“repetitions” of the sampling procedure which led to the original data, and 
in both sets the number of observations taken is customarily the same as 
the number of observations in the original data. The two sets differ in that, 
in the first set, it is supposed that the hypothesis H is always true, while in 
the second set it is supposed that the hypothesis H’ is always true. We can 
then tabulate the probabilities, or relative frequencies, as follows: 

                    1st set      2nd set 
                    (H true)     (H’ true) 
    Accept H:          r            w’ 
    Accept H’:         w            r’ 
    No decision:       u            u’ 

Thus r represents the relative frequency with which our choice of A and B 
leads to acceptance of H, in the first set of imaginary repetitions of the 
experiment, while w’ measures the same relative frequency in the second 
set of imaginary repetitions. r and r’ measure relative frequencies of cases 
where the right decision is made, while w and w’ measure relative frequencies 
of wrong decisions. Now we know that r, w, and u are all positive (or zero), 
and that their sum is 1. Similarly with r’, w’ and u’. Further, if we have taken 
A=1000, B=1/100 for all decisions in the set, we know that r must be at 
least 1000 times as large as w’, and that r’ must be at least 100 times as large 
as w. Since neither r nor r’ can exceed 1, it follows that w’ cannot exceed 
1/1000, and w cannot exceed 1/100. Thus we are entitled to say that, by 
following our rule, we ensure that there is only a small chance that any of 
our experiments will lead to a wrong conclusion. But we are not entitled 
to say anything yet about the probability that our experiments will lead to a 
right conclusion. In this respect, the situation is parallel to that arising in 
the classical theory, when we apply a significance test to an experimental 
result. Such a test guarantees against wrongly rejecting the null hypothesis, 
but in itself it makes no guarantee about rightly rejecting it. In our case, 
we know that r = 1 − w − u, and that r’ = 1 − w’ − u’, and since w is less than 
0.01 we can say that r is greater than 0.99 − u. But, in order to say more, 
we need to know something about u. 

It is at this stage that the notion of a sequential procedure of experimenta- 
tion is introduced. What has been said about the chances of wrong decisions 
etc. can be applied to data already collected, to cases in which the experiment 
in question has already been done, or to cases in which the number of ob- 
servations to be taken is fixed in advance. But it need not be; the only 
property of the data which was used in deriving the conclusions about 
risks of error was the likelihood ratio. We did, in fact, refer our data to an 
imaginary class in which the number of observations was constant, but this 
was only because it was customary; we need not have done so. In fact, in the 
full sequential procedure, we pledge ourselves in advance to go on with our 
experiments until a decision for H or for H’ is reached. In this way, we re- 
duce u and u’ to zero; and then we can say that, with our choice of A and 
B, r will be at least 0.99, while r’ will be at least 0.999. If by some chance, 
we are not able to pledge ourselves to go on to the bitter end, until a decision 
for H or for H’ is reached, but we are able to take samples large enough to 
ensure that u and u’ do not exceed, say, 0.01, then we can say that r will be 
at least 0.98, while r’ will be at least 0.989. 
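
The arithmetic behind these figures follows from r = 1 − w − u and r’ = 1 − w’ − u’. A minimal sketch, taking the bounds on w and w’ as quoted above (1/100 and 1/1000) and using exact fractions to avoid rounding; the function name is an illustrative assumption.

```python
from fractions import Fraction


def right_decision_floor(w_max, u_max):
    """Lower bound on r when r = 1 - w - u, with w <= w_max and u <= u_max."""
    return 1 - Fraction(w_max) - Fraction(u_max)


w_max = Fraction(1, 100)         # bound on w quoted in the text
w_prime_max = Fraction(1, 1000)  # bound on w' quoted in the text

# Full sequential procedure: a decision is always reached, so u = u' = 0.
print(right_decision_floor(w_max, 0), right_decision_floor(w_prime_max, 0))

# Truncated procedure: u and u' are only known not to exceed 1/100.
u_max = Fraction(1, 100)
print(right_decision_floor(w_max, u_max), right_decision_floor(w_prime_max, u_max))
```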

If instead of wishing to decide between two hypotheses, we wished to de- 
cide between three, H, H’ and H’’, say, we could argue along similar lines. 
If L, L’ and L’’ are, respectively, the likelihoods of data R on the three 
hypotheses, then our decision should be based on the value of some 
homogeneous function f(L, L’, L’’), of degree zero. We might choose to leave 
matters undecided unless one of the three likelihoods was at least 100 times 
as large as either of the other two. Then an argument similar to that given 
would enable us to conclude that, in any case, the chance of wrongly choosing 
would be less than 1/100, while, if we were able to guarantee a decision of 
some kind, by taking sufficient data, we should be able to say that the 
chance of rightly choosing would be at least 98/100. 
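
A rule of this kind for several hypotheses might be sketched as follows. The function name, the example likelihoods, and the choice to return an index (or `None` when matters are left undecided) are illustrative assumptions. Because the rule depends only on ratios of likelihoods, it is homogeneous of degree zero: multiplying every likelihood by the same constant leaves the decision unchanged.

```python
def choose_hypothesis(likelihoods, factor=100.0):
    """Return the index of the hypothesis whose likelihood is at least
    `factor` times each of the others, or None if there is no such one."""
    for i, li in enumerate(likelihoods):
        others = [lj for j, lj in enumerate(likelihoods) if j != i]
        if li > 0 and all(li >= factor * lj for lj in others):
            return i
    return None


# Hypothetical likelihoods of the data on H, H' and H''.
clear_case = [0.5, 0.004, 0.001]
doubtful_case = [0.5, 0.02, 0.001]

print(choose_hypothesis(clear_case))      # first hypothesis dominates
print(choose_hypothesis(doubtful_case))   # undecided

# Degree-zero homogeneity: a common scale factor does not alter the choice.
scaled = [0.25 * x for x in clear_case]
print(choose_hypothesis(scaled))
```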










Almost all the test situations which have been dealt with in the classical 
theory of testing hypotheses, or in the theory of sampling inspection, can be 
reduced to the form of tests of two or more simple hypotheses against each 
other. For example, the situation dealt with by the t test, in the classical 
theory, may be treated in several ways, each way more or less appropriate to a 
suitable set of practical circumstances. If we wish to know whether the mean 
of a normal population is zero, as opposed to d times its standard deviation 
(unknown), we can argue that since the problem as stated is invariant under 
change of scale, so must its solution be. And this implies that we must base 
our decision on the value of the ordinary t statistic (the ratio of the sample 
mean to its estimated standard deviation). The two simple hypotheses H 
and H’ then assert, respectively, that the t statistic has the central 
t distribution, and that the t statistic has the non-central t distribution 
with parameter 
d. The likelihood procedure for these two hypotheses then gives us a test 
applicable to this case. Alternatively, we may prefer to make a non-para- 
metric test; and this can be done by considering simply whether or not the 
successive observations are positive or negative. If the population median is 
zero, then the probability of getting a positive reading is ½. We can then test 
this hypothesis against one which says that this probability is p ≠ ½. 
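
The sign-based procedure can be sketched as a sequential likelihood ratio test on the signs alone. This is only an illustration under stated assumptions: the alternative probability 0.75 is hypothetical, the thresholds A = 1000 and B = 1/100 are those used as an example earlier in this review, and observations exactly equal to zero are assumed not to occur.

```python
from math import log


def sequential_sign_test(observations, p_alt=0.75, A=1000.0, B=0.01):
    """Sequential test of H: P(positive) = 1/2 against H': P(positive) = p_alt.

    Each positive reading multiplies the likelihood ratio L'/L by p_alt / 0.5,
    each negative reading by (1 - p_alt) / 0.5. Sampling stops as soon as the
    ratio exceeds A (accept H') or falls below B (accept H).
    Returns the decision and the number of observations used.
    """
    step_pos = log(p_alt / 0.5)
    step_neg = log((1 - p_alt) / 0.5)
    log_ratio = 0.0
    for n, x in enumerate(observations, start=1):
        log_ratio += step_pos if x > 0 else step_neg
        if log_ratio > log(A):
            return "accept H'", n
        if log_ratio < log(B):
            return "accept H", n
    return "data insufficient", len(observations)


print(sequential_sign_test([1.0] * 30))       # every reading positive
print(sequential_sign_test([-1.0] * 30))      # every reading negative
print(sequential_sign_test([1.0, -1.0] * 5))  # signs alternate
```

Working on the logarithm of the ratio turns the running product into a running sum, which is the usual way of keeping such a tally by hand.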

The fact that the theory of sequential tests bypasses, as it were, so much 
of standard statistical theory has led Professor Wald to begin his book with 
a brief review of the fundamental ideas of probability. He follows this with 
a concise description of the ideas involved in the Neyman-Pearson theory of 
testing statistical hypotheses, which is then contrasted and compared with 
the general ideas of sequential analysis. The notions of efficiency of test 
procedures, operating characteristic function, and average sample number 
function are defined here. After this comes a very full discussion of the 
sequential probability (or likelihood) ratio test for deciding between two 
simple hypotheses. Its OC function and ASN function are found, its efficiency 
is compared with that of the classical test procedure, and the effect of 
truncation is considered, along with a number of more detailed points. The 
theory is then extended to cover tests of simple and composite hypotheses 
against sets of alternatives, and the sequential analogue of the t test is 
discussed as a special case. This ends Part I of the book. In Part II the 
general theory is applied to important special cases: testing the mean of a 
binomial distribution, testing the equality of means of two binomial 
distributions, the corresponding problem with a normal distribution with known 
variance, testing the variance of a normal distribution with known and with 
unknown mean. In all these discussions a good deal of practical detail is 
given, though no tables of functions involved in practical calculations are 
included. In Part III, multivalued decisions, where the choice is between 
several alternative hypotheses, are dealt with, and the problems involved 
in the theory of sequential estimation are indicated. Finally, the last quarter 
of the book is taken up with an appendix, in which are found the more 
difficult mathematical proofs omitted from the body of the book, together 
with an appreciable amount of new material, for example a discussion of 
the simultaneous testing of the means of several normal distributions. 

The author says, “An effort has been made to keep the exposition on a level 
that will make most of the book, with the exception of the Appendix, under- 
standable to readers whose mathematical background does not go beyond 
college algebra and a first course in calculus. Some knowledge of probability 
and statistics is desirable for the understanding of the book, though not 
essential.” In the reviewer’s opinion, this statement tends to underestimate 
the level at which the book is written. It is undoubtedly true that much of 
the book could be understood by someone with such elementary mathe- 
matical equipment—in fact, enough, of sufficient importance, to make it 
well worth while for such a person to attempt to work through it. But the 
full impact of the book will be best felt by those with a little more familiarity 
with mathematical ideas. In particular if the work is to be used as a manual 
for practical procedures, a warning should be given that, for example, some 
of the logarithms in the formulae can be common ones, while in others, 
natural logarithms have to be used. More numerical examples would have 
helped in this connection. 

For the more mathematical reader, the exposition has a clarity and a unity 
which make it exceedingly attractive. There are three main themes. First, 
the probability ratio test for simple hypotheses, which is worked out in full. 
Next, the theme of weight functions in the parameter space is developed, to 
cope with composite cases and multiple decisions. Finally, in the Appendix, 
all the main results flow from the fundamental lemma on characteristic 
functions. Further, for the mathematical reader, a most stimulating number 
of unsolved questions are raised. 

But it is the aesthetically satisfying unity of the exposition which is, 
perhaps, the main defect of the book. In a book such as this, one has the 
opportunity, denied in most scientific papers, of presenting the same matter 
from several points of view. This has not been done here. No mention is 
made of the fact that the sequential procedure in the case of discrete popula- 
tions is exactly analogous to a game of chance. Many of the probability 
problems raised are identical with those solved by the classical founders of 
the subject—de Moivre, Huyghens, Laplace—and it would have been help- 
ful to have dealt with some of the problems by the methods of these writers. 
Again, the analogy of the continuous case with the diffusion problem is not 
mentioned; for physicists, applying sequential methods, this would have 
helped them to gain insight into what was happening. At another point in 
the discussion of the problem of testing the mean of a normal distribution 
against two-sided alternatives, the final result is arrived at in accordance 
with the general theory of weight functions over the parameter space. Yet 
this final result, to a sufficiently close approximation, could have been 
arrived at directly from the one-sided case. We could argue that to test 
whether a mean is zero, against positive or negative alternatives, we could 
simultaneously test the zero mean against positive alternatives, and the 
zero mean against negative alternatives. Provided we suitably allow for the 
two chances of wrongly accepting the zero mean, we arrive at approximately 
the same test as before. 

There are three points on which this reviewer would register disagreement. 
First, on page 109, Professor Wald persists in an incorrect statement he has 
made earlier, to the effect that the classical test procedure for 2 × 2 tables 
(in Fisher’s form for small samples, or in Yates’ form for larger ones) is not 
applicable to cases where the probabilities vary from trial to trial. These 
methods are applicable, exactly, if and only if the proper randomization 
procedure has been carried out—regardless of variations in probabilities. 
The second point is a query, rather than a disagreement. Something seems 
to be wrong with the sequential t-test as given by Professor Wald. Sampling 
experiments seem to show that if this procedure is replaced by a much 
simpler one, in which we use the signs (+ or —) of the observations only, 
then there may be an actual gain in efficiency. Furthermore, if one takes 
the line of argument indicated earlier in this review for this case, involving 
the non-central t distribution, one arrives at a test almost, but not quite, 
identical with Professor Wald’s. Something remains to be cleared up here. 

The third point of disagreement is rather fundamental. In his opening 
words, Professor Wald says, “Sequential analysis is a method of statistical 
inference whose characteristic feature is that the number of observations 
required by the procedure is not determined in advance of the experiment.” 
Such a statement would imply that sequential methods of analysis are not 
applicable to such things as long-term agricultural experiments, where ma- 
terial circumstances demand that the number of observations should be 
fixed in advance. But if the argument we have outlined at the beginning 
of this review is sound, it would follow that the characteristic feature of 
sequential, as opposed to classical analysis, is that we allow for the possibility, 
in any finite case, of not being able to decide; and that the likelihood or 
probability ratio method of analysis is applicable, not only to the fields in which 
observations can be taken successively, in large numbers of small groups, 
but to all questions where the choice lies between a finite number of exclusive 
alternatives. If this latter opinion is correct, it leaves the theory of sequential 
testing almost unaltered; but the sphere of applicability is much wider than 
might otherwise be thought. 


REVIEW BY ROBERT FERBER 
54 West 89 St., New York City 


PROFESSIONAL statisticians will welcome this book as a very useful and 
well-written reference work on the theory of one of the latest statistical 
techniques, written by the one who developed the subject and who, con- 
sequently, is most qualified to write such a book. The book brings together 
all of the existing theory on sequential analysis under one cover. 

In an attempt to make the book understandable to those whose 
mathematical training has not gone beyond a first course in calculus, the more 
intricate derivations and analyses are relegated to a 50-page mathematical 
appendix, and an introductory chapter is devoted to a review of the current 
theory of testing hypotheses. Part I, General Theory, proceeds to develop 
the notion of a sequential test (Chapter 2), and of the sequential probability 
ratio test in particular (Chapters 3 and 4). The sequential probability ratio 
test for testing a simple hypothesis is discussed in Chapter 3, and the se- 
quential probability ratio test for testing simple and composite hypotheses 
against sets of alternatives is outlined in Chapter 4. 

Five applications of the sequential probability ratio test are presented 
in Part II, Application of the General Theory to Special Cases, all of which 
will be quite familiar to the reader of the previous material on the subject. 
Very few computational aids and no computational tables are provided. 

The two chapters in Part III present the latest developments on the 
application of the sequential probability ratio test to multiple decision 
problems and to statistical estimation. A general approach is outlined to 
both of these problems though, as the author indicates, extensive work has 
yet to be done before a complete systematic theory is developed for each 
of these two problems. 

The book is written very concisely and systematically. Though occasional 
repetitions do occur (e.g., compare pages 72-73 with pages 78-79), they are 
not likely to prove too irksome to the professional statistician and will be 
welcomed by the beginning student who will probably have enough diffi- 
culty with it as it is. The one unsystematic part of the book, in this re- 
viewer’s opinion, is Chapter 9 where the author fails to state or even men- 
tion the operating characteristic or average sample number formulas for 
testing the mean of a normal distribution in a two-sided alternative. This is 
contrary to the practice followed in the previous four chapters on applica- 
tions. The relatively small number of misprints is a welcome feature of this 
book; probably the most glaring misprint occurs in formula (3:3) on page 38, 
where a > sign has accidentally been used instead of the < sign. 

An unfortunate aspect of this book is the continual reference to sampling 
inspection and to physical science situations for illustrative examples. 
Although it is true that sequential analysis was first developed to aid physical 
science, the fact that the method can equally be applied to problems of social 
science and of commercial research does not seem to have yet gained wide 
recognition, either among mathematical statisticians or, consequently, 
among practical commercial researchers. This reviewer has yet to see an 
illustration of the application of the method to commercial research in a 
publication on sequential analysis. Until the applicability of sequential 
analysis to these problems is generally recognized, the market for such books 
as the present one is artificially restricted to professional statisticians and to 
physical scientists. 


Say It With Figures. Hans Zeisel (McCann-Erickson, New York City). New 
York 16: Harper & Brothers (383 Madison Ave.), 1947. Pp. xvii, 250. $3.00. 


REVIEW BY ALBERT B. BLANKENSHIP 
Managing Director of National Analysts, Inc. 
Philadelphia, Pennsylvania 


TO THE technician in consumer and opinion research, this book reads 
like a novel. It takes all of those vague tabulation problems and not only 
brings them out into the open, but shows the direction of their solution. 
Yet with all of the technical problems it covers, it is never dull reading. 
Despite this facile reading, this is not a book for beginners in the field. It 
assumes knowledge of elementary tabulation on the part of the reader. 

The book covers classification of replies, numerical summary, and inter- 
pretation. It is the only attempt that the reviewer has ever seen to spell 
out methods and principles of making tabulation decisions which are so 
necessary before sound statistical treatment can be applied. 

How should reason questions be tabulated? What should be done when a 
question is used to which a person can give more than one reply? How should 
“don’t know’s” and “no answers” be handled in the tabulation? In what 
direction should percentages be run? How should relationships between data 
be analyzed and presented? How can tabulations be made so as to determine 
the presence or absence of cause and effect in correlated data? These and 
related questions are not only raised, but the methods of solution are pre- 
sented. What Zeisel has done—and done well—is to cover in detail all of 
those intermediate tabulation steps which no statistics writer has ever dis- 
cussed. The steps, as written by Zeisel, seem elementary. Actually this is 
merely another way of saying that the book is effectively written. 

After all of this praise, it should certainly be mentioned that the perfect 
book still has not been written. This book has several shortcomings. 

One is the tendency to oversimplify. In general, of course, this makes for 
easy reading. But many problems are more complicated than the author 
implies. The casual reader, not too well trained in the work, may not get the 
implications of difficulty and complexity which he should. This is merely 
another way of saying that the text looks so abbreviated that the author was 
unable to explain adequately many of his points. 

The title, itself, is not too good. It appears that the title was selected 
more for its possible appeal to businessmen than for its description of con- 
tents. 

Another shortcoming is the lack of an introductory section on elementary 
tabulation techniques. While this was not stated as part of the scope of the 
book, it is clear that the novice in the field would be lost in attempting to 
follow the material presented. Several chapters on hand and machine tabu- 
lation, along with visual analysis, would add immeasurably to the volume. 
There would then be available, within one cover, a complete manual on 
tabulation. 
Perhaps this is merely another way of expressing the hope that this book 
is but the first in a series on the various steps in conducting consumer and 
opinion research. The texts which offer general coverage of the field cannot 
go into necessary detailed explanations of the steps of defining the problem, 
determining the method of collecting the data, devising and testing the 
questionnaire, securing a cross section of respondents, collection of the 
data, summary of results, and report writing. The advanced student has 
no intermediate step from the elementary over-all book in the field, to the 
postgraduate offerings of the technical journals and the book edited by 
Cantril. Surely Zeisel’s book then marks an important milestone in this 
intermediate zone. If the other fields are to be similarly covered, Zeisel has 
set a standard for scope and readability that will be difficult to match. 

This book should be useful in the classroom, for the undergraduate student 
who has completed a first course in commercial research. It will also be of 
some value to the beginner in a practical research situation, though he will 
have to learn elementary tabulation principles elsewhere. The book should 
also be useful to “postgraduates” in the field who are honest enough to admit 
that they have never summarized their thinking on principles of tabulation. 

This book should enjoy a wide audience. If it does, it should have the 
effect of completing the transition in commercial research from mere nose- 
counting to qualitative analysis. 